Description

Low Bit Rate Audio Encoding and Decoding

Technical Field

The invention relates generally to audio signal processing. More particularly,
aspects of the invention relate to an encoder (or encoding process), a decoder (or
decoding process), and to an encode/decode system (or encoding/decoding process) for
audio signals with a very low bit rate in which a plurality of audio channels is represented
by a composite monophonic audio channel and auxiliary ("sidechain") information.
Alternatively, the plurality of audio channels is represented by a plurality of audio
channels and sidechain information. Aspects of the invention also relate to a
multichannel to composite monophonic channel downmixer (or downmix process), to a
monophonic channel to multichannel upmixer (or upmixer process), and to a monophonic
channel to multichannel decorrelator (or decorrelation process). Other aspects of the
invention relate to a multichannel to multichannel downmixer (or downmix process), to a
multichannel to multichannel upmixer (or upmix process), and to a decorrelator (or
decorrelation process).

Background Art

In the AC-3 digital audio encoding and decoding system, channels may be
selectively combined or "coupled" at high frequencies when the system becomes starved
for bits. Details of the AC-3 system are well known in the art; see, for example: ATSC
Standard A52/A: Digital Audio Compression Standard (AC-3), Revision A, Advanced
Television Systems Committee, 20 Aug. 2001. The A/52A document is available on the
World Wide Web at http://www.atsc.org/standards.html. The A/52A document is hereby
incorporated by reference in its entirety.

The frequency above which the AC-3 system combines channels on demand is
referred to as the "coupling" frequency. Above the coupling frequency, the coupled
channels are combined into a "coupling" or composite channel. The encoder generates
"coupling coordinates" (amplitude scale factors) for each subband above the coupling
frequency in each channel. The coupling coordinates indicate the ratio of the original
energy of each coupled channel subband to the energy of the corresponding subband in
the composite channel. Below the coupling frequency, channels are encoded discretely.
The phase polarity of a coupled channel's subband may be reversed before the channel is
combined with one or more other coupled channels in order to reduce out-of-phase signal
component cancellation. The composite channel along with sidechain information that
includes, on a per-subband basis, the coupling coordinates and whether the channel's
phase is inverted, are sent to the decoder. In practice, the coupling frequencies employed
in commercial embodiments of the AC-3 system have ranged from about 10 kHz to about
3500 Hz. U.S. Patents 5,583,963; 5,727,1 5,909,664, and 6,021,386
include teachings that relate to the combining of multiple audio channels into a composite
channel and auxiliary or sidechain information and the recovery therefrom of an
approximation to the original multiple channels. Each of said patents is hereby
incorporated by reference in its entirety.

Aspects of the present invention may be viewed as improvements upon the
"coupling" techniques of the AC-3 encoding and decoding system and also upon other
techniques in which multiple channels of audio are combined either to a monophonic
composite signal or to multiple channels of audio along with related auxiliary information
and from which multiple channels of audio are reconstructed. Aspects of the present
invention also may be viewed as improvements upon techniques for downmixing multiple
audio channels to a monophonic audio signal or to multiple audio channels and for
decorrelating multiple audio channels derived from a monophonic audio channel or from
multiple audio channels.

Aspects of the invention may be employed in an N:1:N spatial audio coding
technique (where "N" is the number of audio channels) or an M:1:N spatial audio coding
technique (where "M" is the number of encoded audio channels and "N" is the number of
decoded audio channels) that improve on channel coupling, by providing, among other
things, improved phase compensation, decorrelation mechanisms, signal-dependent
variable time constants, and more compact amplitude representation. Aspects of the
present invention may also be employed in N:x:N and M:x:N spatial audio coding
techniques wherein "x" may be 1 or greater than 1. Goals include the reduction of
coupling cancellation artifacts in the encode process by adjusting interchannel phase shift
before downmixing, and improving the spatial dimensionality of the reproduced signal by
restoring the phase angles and degrees of decorrelation in the decoder. Aspects of the
invention when embodied in practical embodiments should allow for continuous rather
than on-demand channel coupling and lower coupling frequencies than, for example, in
the AC-3 system, thereby reducing the required data rate.

FIG. 1 is an idealized block diagram showing the principal functions or devices of
an N:1 encoding arrangement embodying aspects of the present invention.

FIG. 2 is an idealized block diagram showing the principal functions or devices of
a 1:N decoding arrangement embodying aspects of the present invention.

FIG. 3 shows an example of a simplified conceptual organization of bins and
subbands along a (vertical) frequency axis and blocks and a frame along a (horizontal)
time axis. The figure is not to scale.

FIG. 4 is in the nature of a hybrid flowchart and functional block diagram
showing encoding steps or devices performing functions of an encoding arrangement
embodying aspects of the present invention.

FIG. 5 is in the nature of a hybrid flowchart and functional block diagram
showing decoding steps or devices performing functions of a decoding arrangement
embodying aspects of the present invention.

FIG. 6 is an idealized block diagram showing the principal functions or devices of
a first N:x encoding arrangement embodying aspects of the present invention.

FIG. 7 is an idealized block diagram showing the principal functions or devices of
an x:M decoding arrangement embodying aspects of the present invention.

FIG. 8 is an idealized block diagram showing the principal functions or devices of
a first alternative x:M decoding arrangement embodying aspects of the present invention.

FIG. 9 is an idealized block diagram showing the principal functions or devices of
a second alternative x:M decoding arrangement embodying aspects of the present
invention.

FIG. 10a is an idealized block diagram showing the principal functions or devices
of an augmented mono/stereo encoder or encoding function according to aspects of the
invention.

FIG. 10b is an idealized block diagram showing an alternative to the arrangement
of FIG. 10a in which a matrix-encoding downmixer or downmixing function is
employed.

FIG. 11a is an idealized block diagram showing the principal functions or devices
of an alternative augmented mono/stereo encoder or encoding function according to
aspects of the invention.

FIG. 11b is an idealized block diagram showing an alternative to the arrangement
of FIG. 10a in which a matrix-encoding downmixer or downmixing function is
employed.

FIG. 12a is an idealized block diagram showing the principal functions or devices
of an alternative augmented mono/stereo decoder or decoding function according to
aspects of the invention.

FIG. 12b is an idealized block diagram showing an alternative to the arrangement
of FIG. 10a in which a matrix-decoding upmixer or upmixing function is employed.

FIG. 12c is an idealized block diagram showing an alternative to the arrangement
of FIG. 10a in which simulated spatial parameters for use above the coupling frequency
are derived from the stereophonic portion of the hybrid stereophonic/monophonic below
the coupling frequency.

Best Mode for Carrying Out the Invention
Basic N:1 Encoder

Referring to FIG. 1, an N:1 encoder function or device embodying aspects of the
present invention is shown. The figure is an example of a function or structure that
performs as a basic encoder embodying aspects of the invention. Other functional or
structural arrangements that practice aspects of the invention may be employed, including
alternative and/or equivalent functional or structural arrangements described below.

Two or more audio input channels are applied to the encoder. Although, in
principle, aspects of the invention may be practiced by analog, digital or hybrid
analog/digital embodiments, examples disclosed herein are digital embodiments. Thus,
the input signals may be time samples that may have been derived from analog audio
signals. The time samples may be encoded as linear pulse-code modulation (PCM)
signals. Each linear PCM audio input channel is processed by a filterbank function or
device having both an in-phase and a quadrature output, such as a 512-point windowed
forward discrete Fourier transform (DFT) (as implemented by a Fast Fourier Transform
(FFT)). The filterbank may be considered to be a time-domain to frequency-domain
transform.
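
By way of illustration only, such a filterbank may be sketched as a windowed forward DFT
whose complex bins carry the in-phase (real) and quadrature (imaginary) outputs. The
following is a minimal sketch under stated assumptions, not the implementation of any
embodiment: the Hann window, the direct DFT used in place of an FFT, the 32-point block
chosen for brevity (the text mentions 512 points), and the name `windowed_dft` are all
assumptions.

```python
import cmath
import math


def windowed_dft(block):
    """Return the complex DFT bins of one Hann-windowed block of PCM samples."""
    n = len(block)
    # Periodic Hann window (an assumption; the text only says "windowed").
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]
    windowed = [s * w for s, w in zip(block, window)]
    bins = []
    for k in range(n):
        acc = 0j
        for i, x in enumerate(windowed):
            acc += x * cmath.exp(-2j * math.pi * k * i / n)
        bins.append(acc)
    return bins


# A cosine with exactly 4 cycles per 32-sample block.
N = 32
block = [math.cos(2.0 * math.pi * 4 * i / N) for i in range(N)]
bins = windowed_dft(block)

# The strongest bin in the lower half of the spectrum, and its two components.
peak_bin = max(range(N // 2), key=lambda k: abs(bins[k]))
in_phase, quadrature = bins[peak_bin].real, bins[peak_bin].imag
```

Each element of `bins` is one transform bin; its real and imaginary parts correspond to
the in-phase and quadrature outputs mentioned above.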

FIG. 1 shows a first PCM channel input (channel "1") applied to a filterbank
function or device, "Filterbank" 2, and a second PCM channel input (channel "n") applied,
respectively, to another filterbank function or device, "Filterbank" 4. There may be "n"
input channels, where "n" is a whole positive integer equal to two or more. Thus, there
also are "n" Filterbanks, each receiving a unique one of the "n" input channels. For
simplicity in presentation, FIG. 1 shows only two input channels, "1" and "n".

When a filterbank is implemented by an FFT, signals are usually processed in
overlapping blocks and the FFT's discrete frequency outputs (transform coefficients) are
referred to as bins, each having a complex value with real and imaginary parts
corresponding, respectively, to in-phase and quadrature components. Contiguous
transform bins may be grouped into subbands approximating critical bandwidths of the
human ear, and most sidechain information produced by the encoder, as will be described,
may be calculated and transmitted on a per-subband basis in order to minimize processing
resources and to reduce the bit rate. Multiple successive blocks may be grouped into
frames, with individual block values averaged or otherwise combined or accumulated
across each frame, to minimize the sidechain data rate. In examples described herein,
each filterbank is implemented by an FFT, contiguous transform bins are grouped into
subbands, blocks are grouped into frames and sidechain data is sent on a once-per-frame
basis. Alternatively, sidechain data may be sent on a more than once per frame basis.
Obviously, there is a tradeoff between the frequency at which sidechain information is
sent and the required bit rate.
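
The grouping just described may be sketched as follows, purely as an illustration. The
subband edges, the use of per-bin energy as the quantity being combined, simple averaging
across blocks, and the names `group_into_subbands` and `frame_average` are assumptions
and are not taken from any embodiment.

```python
def group_into_subbands(bin_energies, subband_edges):
    """Sum per-bin energies into subbands; edges give [start, end) bin indices."""
    return [
        sum(bin_energies[lo:hi])
        for lo, hi in zip(subband_edges, subband_edges[1:])
    ]


def frame_average(blocks_of_subband_energy):
    """Average each subband's value across the blocks of one frame."""
    n_blocks = len(blocks_of_subband_energy)
    return [sum(column) / n_blocks for column in zip(*blocks_of_subband_energy)]


# Hypothetical edges: one-bin subbands at low frequency, wider subbands higher up.
EDGES = [0, 1, 2, 4, 8]

# Two blocks of eight bin energies each, forming one (very short) frame.
frame = [
    [1.0, 2.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5],
    [3.0, 2.0, 3.0, 1.0, 1.5, 0.5, 1.5, 0.5],
]
per_block = [group_into_subbands(block, EDGES) for block in frame]
per_frame = frame_average(per_block)
```

Only `per_frame` (four values here) would need to be conveyed once per frame, rather
than eight values for every block.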

A suitable practical implementation of aspects of the present invention may
employ fixed length frames of about 32 milliseconds when a 48 kHz sampling rate is
employed, each frame having six blocks of about 5.3 milliseconds each. However,
neither such timings nor the employment of fixed length frames nor their division into a
fixed number of blocks is critical to practicing aspects of the invention provided that
information described herein as being sent on a per-frame basis is sent about every 20 to
40 milliseconds. Frames may be of arbitrary size and their size may vary dynamically.
Variable block lengths may be employed as in the AC-3 system cited above. It is with
that understanding that reference is made herein to "frames" and "blocks."
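
The timings above can be checked with a short calculation. The advance of 256 samples per
block (that is, 512-point blocks overlapped by half) is an assumption made here to arrive
at the stated figure of about 5.3 milliseconds; the text itself does not state the overlap.

```python
SAMPLE_RATE_HZ = 48_000
BLOCK_HOP_SAMPLES = 256   # assumption: 512-point blocks advanced by half a block
BLOCKS_PER_FRAME = 6

# Duration covered by one block advance, and by one frame of six blocks.
block_ms = 1000.0 * BLOCK_HOP_SAMPLES / SAMPLE_RATE_HZ
frame_ms = block_ms * BLOCKS_PER_FRAME

# How often once-per-frame sidechain data would be sent.
frames_per_second = 1000.0 / frame_ms
```

Under that assumption a block lasts about 5.33 milliseconds and a frame about 32
milliseconds, which falls within the 20 to 40 millisecond range mentioned above.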

In practice, if the mono composite signal or the mono composite signal and
discrete low-frequency channels are perceptually encoded, as described below, it is
convenient to employ the same frame and block configuration as employed in the
perceptual coder.

FIG. 3 shows an example of a simplified conceptual organization of bins and
subbands along a (vertical) frequency axis and blocks and a frame along a (horizontal)
time axis. When bins are divided into subbands that approximate critical bands, the
lowest frequency subbands have the fewest bins (e.g., one) and the number of bins per
subband increases with increasing frequency.

Returning to FIG. 1, the frequency-domain versions of each of the n time-domain
input channels, produced by each channel's respective filterbank (Filterbanks 2 and 4
in this example), are summed together ("downmixed") to a monophonic ("mono")
composite audio signal by an additive combiner 6.

The downmixing may be applied to the entire frequency bandwidth of the input
audio signals or, optionally, it may be limited to frequencies above a given "coupling"
frequency, inasmuch as artifacts of the downmixing process may become more audible at
middle to low frequencies. In such cases, the channels may be conveyed discretely below
the coupling frequency. Such an arrangement is described below in connection with the
examples of FIGS. 10, 11 and 12. This strategy may be desirable even if processing
artifacts are not an issue, in that mid/low frequency subbands constructed by grouping
transform bins into critical-band-like subbands (size roughly proportional to frequency)
tend to have a small number of transform bins at low frequencies (one bin at very low
frequencies) and may be directly coded with as few or fewer bits than is required to send
a downmixed mono audio signal with sidechain information. In a practical embodiment
of aspects of the present invention, a coupling frequency as low as 2300 Hz has been
found to be suitable. However, the coupling frequency is not critical, and lower coupling
frequencies, even a coupling frequency at the bottom of the frequency band of the audio
signals applied to the encoder, may be acceptable for some applications, particularly those
in which a very low bit rate is important.
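
As an illustration of how a coupling frequency maps onto transform bins, the sketch below
locates the first bin at or above a given frequency and splits a channel's bins at that
point. Combining the 512-point transform and the 48 kHz sampling rate mentioned earlier is
an assumption made for the example, and the function names are hypothetical.

```python
import math


def first_bin_at_or_above(freq_hz, sample_rate_hz, dft_size):
    """Index of the lowest DFT bin whose center frequency is >= freq_hz."""
    bin_spacing_hz = sample_rate_hz / dft_size
    return math.ceil(freq_hz / bin_spacing_hz)


def split_at_coupling(bins, coupling_bin):
    """Return (bins kept discrete, bins passed to the downmix)."""
    return bins[:coupling_bin], bins[coupling_bin:]


# 48000 / 512 = 93.75 Hz between adjacent bins.
coupling_bin = first_bin_at_or_above(2300.0, 48_000, 512)

# Bins 0..255 stand in for the non-redundant half of one channel's spectrum.
discrete_bins, coupled_bins = split_at_coupling(list(range(256)), coupling_bin)
```

With these assumed figures the first coupled bin is bin 25, centered at 2343.75 Hz.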

Before downmixing, it is an aspect of the present invention to improve the
channels' phase angle alignments vis-a-vis each other, in order to reduce the cancellation
of out-of-phase signal components when the channels are combined and to provide an
improved mono composite channel. This may be accomplished by controllably shifting
over time the "absolute angle" of some or all of the transform bins in ones of the
channels. For example, all of the transform bins representing audio above a coupling
frequency, thus defining a frequency band of interest, may be controllably shifted over
time, as necessary, in every channel or, when one channel is used as a reference, in all but
the reference channel.

The "absolute angle" of a bin may be taken as the angle of the magnitude-and-
angle representation of each complex valued transform bin produced by a filterbank.
Controllable shifting of the absolute angles of bins in a channel is performed by an angle
rotation function or device ("Rotate Angle"). Rotate Angle 8 processes the output of
Filterbank 2 prior to its application to the downmix summation 6, while Rotate Angle 10
processes the output of Filterbank 4 prior to its application to the downmix summation 6.
It will be appreciated that, under some signal conditions, no angle rotation may be
required for a particular transform bin over a time period (the time period of a frame, in
examples described herein). Below the coupling frequency, the channel information may
be encoded discretely (not shown in FIG. 1; see, for example, the examples of FIGS. 10
and 11, below).

In principle, an improvement in the channels' phase angle alignments with respect
to each other may be accomplished by phase shifting every transform bin or subband by
the negative of its absolute phase angle, in each block throughout the frequency band of
interest. Although this substantially avoids cancellation of out-of-phase signal
components, it tends to cause artifacts that may be audible, particularly if the resulting
mono composite signal is listened to in isolation. Thus, it is desirable to employ the
principle of "least treatment" by shifting the absolute angles of bins in a channel only as
much as necessary to minimize out-of-phase cancellation in the downmix process and
minimize spatial image collapse of the multichannel signals reconstituted by the decoder.
A preferred technique for determining such angle shift is described below.
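
The effect of angle rotation on the downmix can be illustrated with two nearly
out-of-phase bins. The sketch below is not the preferred technique referred to above; the
`fraction` parameter, which scales how much of the relative angle is removed, and the
function names are hypothetical devices used only to show the range between no
treatment and full alignment.

```python
import cmath
import math


def align_to_reference(ref_bin, bin_value, fraction=1.0):
    """Rotate bin_value toward the angle of ref_bin by a fraction of their difference."""
    relative = cmath.phase(bin_value) - cmath.phase(ref_bin)
    return bin_value * cmath.exp(-1j * fraction * relative)


def downmix(bins):
    """Additive combination of the same bin from each channel."""
    return sum(bins)


ref = cmath.rect(1.0, 0.0)                    # reference channel bin
other = cmath.rect(1.0, math.radians(170.0))  # nearly out of phase with it

plain = downmix([ref, other])                                     # no rotation
partial = downmix([ref, align_to_reference(ref, other, 0.5)])     # half the shift
aligned = downmix([ref, align_to_reference(ref, other, 1.0)])     # full alignment
```

Without rotation the two bins nearly cancel; with full alignment they add coherently. A
smaller shift, as in `partial`, recovers much of the energy while altering the channel's
angles less.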

Energy normalization may also be performed on a per-bin basis in the encoder to
reduce further any remaining out-of-phase cancellation of isolated bins, as described
further below. Also as described further below, energy normalization may also be
performed on a per-subband basis (in the decoder) to assure that the energy of the mono
composite signal equals the sums of the energies of the contributing channels.
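
The per-subband condition stated above (mono composite energy equal to the sum of the
contributing channels' energies) may be sketched as a single gain applied to the
subband's bins. This is a minimal sketch under stated assumptions: energy is taken as the
sum of squared bin magnitudes, and the names `energy` and `normalize_subband` are
hypothetical.

```python
import math


def energy(bins):
    """Sum of squared magnitudes of the bins in one subband."""
    return sum(abs(b) ** 2 for b in bins)


def normalize_subband(mono_bins, channel_subbands):
    """Scale mono_bins so their energy equals the summed channel energies."""
    target = sum(energy(channel) for channel in channel_subbands)
    actual = energy(mono_bins)
    if actual == 0.0:
        return list(mono_bins)  # nothing to scale
    gain = math.sqrt(target / actual)
    return [b * gain for b in mono_bins]


# Two channels, one subband of two bins each; the first bins partly cancel.
ch1 = [1 + 0j, 0 + 1j]
ch2 = [-0.5 + 0j, 0 + 1j]
mono = [a + b for a, b in zip(ch1, ch2)]

normalized = normalize_subband(mono, [ch1, ch2])
```

Here the channel energies sum to 3.25 while the raw downmix has energy 4.25, so the gain
is slightly below one.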

Each input channel has an audio analyzer function or device ("Audio Analyzer")
associated with it for generating the sidechain information for that channel and for
controlling the amount of angle rotation applied to the channel before it is applied to the
downmix summation 6. The Filterbank outputs of channels 1 and n are applied to Audio
Analyzer 12 and to Audio Analyzer 14, respectively. Audio Analyzer 12 generates the
sidechain information for channel 1 and the amount of angle rotation for channel 1.
Audio Analyzer 14 generates the sidechain information for channel n and the amount of
angle rotation for channel n.

The sidechain information for each channel generated by an audio analyzer for
each channel may include:

an Amplitude Scale Factor ("Amplitude SF"),
an Angle Control Parameter,
a Decorrelation Scale Factor ("Decorrelation SF"), and
a Transient Flag.

In each case, the sidechain information applies to a single subband (except for the
Transient Flag, which applies to all subbands within a channel) and may be updated once
per frame as in the examples described below. The angle rotation for a particular channel
in the encoder may be taken as the polarity-reversed Angle Control Parameter that forms
part of the sidechain information.
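
One possible in-memory layout for this sidechain information is sketched below, for
illustration only. The class and field names, the use of radians for the Angle Control
Parameter, and the absence of any quantization are assumptions; the text does not
specify a data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubbandSidechain:
    """Per-subband sidechain values for one channel in one frame."""
    amplitude_sf: float
    angle_control: float      # radians (assumed unit)
    decorrelation_sf: float


@dataclass
class ChannelSidechain:
    """Sidechain for one channel; the Transient Flag covers all of its subbands."""
    transient_flag: bool
    subbands: List[SubbandSidechain]


def encoder_rotation(sidechain, subband_index):
    """Encoder angle rotation: the polarity-reversed Angle Control Parameter."""
    return -sidechain.subbands[subband_index].angle_control


example = ChannelSidechain(
    transient_flag=False,
    subbands=[SubbandSidechain(0.7, 0.5, 0.2), SubbandSidechain(0.4, -1.0, 0.6)],
)
rotations = [encoder_rotation(example, i) for i in range(len(example.subbands))]
```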

If a reference channel is employed, that channel may not require an audio analyzer
or, alternatively, may require an audio analyzer that generates only Amplitude Scale
Factor sidechain information. It is not necessary to send an Amplitude Scale Factor if
that scale factor can be deduced with sufficient accuracy by a decoder from the
Amplitude Scale Factors of the other, non-reference, channels. It is possible to deduce in
the decoder the approximate value of the reference channel's Amplitude Scale Factor if
the energy normalization in the encoder assures that the scale factors across channels
within any subband substantially sum square to 1, as described below. The deduced
approximate reference channel Amplitude Scale Factor value may have errors as a result
of the relatively coarse quantization of amplitude scale factors, resulting in image shifts in
the reproduced multichannel audio. However, in a low data rate environment, such
artifacts may be more acceptable than using the bits to send the reference channel's
Amplitude Scale Factor. Nevertheless, in some cases it may be desirable to employ an
audio analyzer for the reference channel that generates, at least, Amplitude Scale Factor
sidechain information.
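
If the scale factors across channels within a subband sum square to 1, the deduction
described above reduces to a one-line calculation. The sketch below assumes linear
(unquantized) scale factors and clamps at zero to absorb the quantization error the text
mentions; the function name is hypothetical.

```python
import math


def deduce_reference_sf(other_sfs):
    """Deduce the reference channel's scale factor from the other channels' factors."""
    remainder = 1.0 - sum(sf * sf for sf in other_sfs)
    # Coarse quantization can push the sum of squares above 1; clamp at zero.
    return math.sqrt(max(remainder, 0.0))


# One non-reference channel with scale factor 0.6 in this subband.
reference_sf = deduce_reference_sf([0.6])

# Quantization error case: the received factors already exceed unit energy.
clamped_sf = deduce_reference_sf([0.8, 0.7])
```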

FIG. 1 shows in a dashed line an optional input to each audio analyzer from the
PCM time domain input to the audio analyzer in the channel. This input may be used by
the audio analyzer to detect a transient over a time period (the period of a block or frame,
in the examples described herein) and to generate a transient indicator (e.g., a one-bit
"Transient Flag") in response to a transient. Alternatively, as described below, a transient
may be detected in the frequency domain, in which case the audio analyzer need not
receive a time-domain input.

The mono composite audio signal and the sidechain information for all the
channels (or all the channels except the reference channel) may be stored, transmitted, or
stored and transmitted to a decoding process or device ("decoder"). Preliminary to the
storage, transmission, or storage and transmission, the various audio signals and the
various sidechain information may be multiplexed and packed into one or more bitstreams
suitable for the storage, transmission or storage and transmission medium or media. The
mono composite audio may be applied to a data-rate reducing encoding process or device
such as, for example, a perceptual encoder or to a perceptual encoder and an entropy
coder (e.g., arithmetic or Huffman coder) (sometimes referred to as a "lossless" coder)
prior to storage, transmission, or storage and transmission. Also, as mentioned above, the
mono composite audio and related sidechain information may be derived from multiple
input channels only for audio frequencies above a certain frequency (a "coupling"
frequency). In that case, the audio frequencies below the coupling frequency in each of
the multiple input channels may be stored, transmitted or stored and transmitted as
discrete channels or may be combined or processed in some manner other than as
described herein. Such discrete or otherwise-combined channels may also be applied to a
data reducing encoding process or device such as, for example, a perceptual encoder or a
perceptual encoder and an entropy encoder. The mono composite audio and the discrete
multichannel audio may all be applied to an integrated perceptual encoding or perceptual
and entropy encoding process or device. As described below in connection with
examples of FIGS. 10, 11 and 12, the various sidechain information may be carried in
what would otherwise have been unused bits or steganographically in an encoded version
of the audio information.

Basic 1:N and 1:M Decoder

Referring to FIG. 2, a decoder function or device ("decoder") embodying aspects
of the present invention is shown. The figure is an example of a function or structure that
performs as a basic decoder embodying aspects of the invention. Other functional or
structural arrangements that practice aspects of the invention may be employed, including
alternative and/or equivalent functional or structural arrangements described below.

The decoder receives the mono composite audio signal and the sidechain
information for all the channels or all the channels except the reference channel. If
necessary, the composite audio signal and related sidechain information is demultiplexed,
unpacked and/or decoded. Decoding may employ a table lookup. The goal is to derive
from the mono composite audio channels a plurality of individual audio channels
approximating respective ones of the audio channels applied to the encoder of FIG. 1,
subject to bitrate-reducing techniques of the present invention that are described herein.

Of course, one may choose not to recover all of the channels applied to the
encoder or to use only the monophonic composite signal. Alternatively, channels in
addition to the ones applied to the encoder may be derived from the output of a decoder
according to aspects of the present invention by employing aspects of the inventions
described in International Application PCT/US 02/03619, filed February 7, 2002,
published August 15, 2002, designating the United States, and its resulting U.S. national
application 10/467,213, filed August 5, 2003, and in International Application
PCT/US03/24570, filed August 6, 2003, published March 4, 2004 as WO 2004/019656,
designating the United States. Said applications are hereby incorporated by reference in
their entirety. Channels recovered by a decoder practicing aspects of the present
invention are particularly useful in connection with the channel multiplication techniques
of the cited and incorporated applications in that the recovered channels not only have
useful interchannel amplitude relationships but also have useful interchannel phase
relationships. Another alternative is to employ a matrix decoder to derive additional
channels. See, for example, the examples of FIGS. 10, 11 and 12, below and their
descriptions. The interchannel amplitude- and phase-preservation aspects of the present
invention make the output channels of a decoder embodying aspects of the present
invention particularly suitable for application to an amplitude- and phase-sensitive matrix
decoder. For example, if the aspects of the present invention are embodied in an N:1:N
system in which N is 2, the two channels recovered by the decoder may be applied to a
2:M matrix decoder. Many suitable matrix decoders are well known in the art, including,
for example, matrix decoders known as "Pro Logic" and "Pro Logic II" decoders ("Pro
Logic" is a trademark of Dolby Laboratories Licensing Corporation) and matrix decoders
embodying aspects of the subject matter disclosed in one or more of the following U.S.
Patents and published International Applications (each designating the United States),
each of which is hereby incorporated by reference in its entirety: 4,799,260; 4,941,177;
5,046,098; 5,274,740; 5,400,433; 5,625,696; 5,644,640; 5,504,819; 5,428,687; 5,172,415;
WO 01/41504; WO 01/41505; and WO 02/19768.

Referring again to FIG. 2, the received mono composite audio channel is applied
to a plurality of signal paths from which a respective one of each of the recovered
multiple audio channels is derived. Each channel-deriving path includes, in either order,
an amplitude adjusting function or device ("Adjust Amplitude") and an angle rotation
function or device ("Rotate Angle"). The Adjust Amplitude is intended to restore the
amplitude (or energy) of the received mono composite signal relative to the amplitude (or
energy) of each of the other recovered channels to an amplitude (or energy) similar to the
original amplitude (or energy) of the channel relative to the other channels at the input of
the encoder. The Rotate Angle is intended, for certain signal conditions, to restore the
angle of the received mono composite signal relative to the angle of each of the other
recovered channels to an angle similar to the original angle of the channel relative to the
other channels at the input of the encoder. Preferably, under certain signal conditions, a
controllable amount of pseudo-random angle variations is also imposed on the angle of a
recovered channel in order to improve its decorrelation with respect to other ones of the
recovered channels. Conceptually, the Adjust Amplitude and Rotate Angle functions for a
particular channel scale the mono composite audio DFT coefficients to yield transform
bin values for the channel.
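
Conceptually, then, each recovered bin is the mono composite bin multiplied by one
complex factor whose magnitude is the amplitude adjustment and whose angle is the
rotation. The sketch below illustrates only that scaling; it applies a single scale
factor and angle to every bin passed in (for example, the bins of one subband), and the
function name and the use of radians are assumptions.

```python
import cmath
import math


def recover_channel_bins(mono_bins, amplitude_sf, angle):
    """Scale and rotate mono composite bins to approximate one channel's bins."""
    # Adjust Amplitude and Rotate Angle combined into one complex multiplier.
    scale = amplitude_sf * cmath.exp(1j * angle)
    return [b * scale for b in mono_bins]


mono_bins = [2 + 0j, 0 + 2j]
channel_bins = recover_channel_bins(mono_bins, 0.5, math.pi / 2)
```

Because the two operations are both multiplications, applying them in either order
yields the same bins, consistent with the "in either order" wording above.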

The Adjust Amplitude for each channel may be controlled by the recovered
sidechain Amplitude Scale Factor for the particular channel or, in the case of the
reference channel, either from the recovered sidechain Amplitude Scale Factor for the
reference channel or from an Amplitude Scale Factor deduced from the recovered
sidechain Amplitude Scale Factors of the other, non-reference, channels. The Rotate
Angle for each channel may be controlled at least by the recovered sidechain Angle
Control Parameter (in which case, the Rotate Angle in the decoder substantially undoes the
angle rotation provided by the Rotate Angle in the encoder). To enhance decorrelation of
the recovered channels, a Rotate Angle may also be controlled by a Pseudo-Random
Angle Control Parameter derived from the recovered sidechain Decorrelation Scale
Factor for a particular channel and the recovered sidechain Transient Flag for the
particular channel. The Pseudo-Random Angle Control Parameter for a channel may be
derived from the recovered Decorrelation Scale Factor for the channel and the recovered
Transient Flag for the channel by a controllable decorrelator function or device
("Controllable Decorrelator").

Referring to the example of FIG. 2, the recovered mono composite audio is
applied to a first channel audio recovery path 22, which derives the channel 1 audio, and
to a second channel audio recovery path 24, which derives the channel n audio. Audio
path 22 includes an Adjust Amplitude 26, a Rotate Angle 28, and, if a PCM output is
desired, an inverse filterbank 30. Similarly, audio path 24 includes an Adjust Amplitude
32, a Rotate Angle 34, and, if a PCM output is desired, an inverse filterbank 36. As with
the case of FIG. 1, only two channels are shown for simplicity in presentation, it being
understood that there may be more than two channels.

The recovered sidechain information for the first channel, channel 1, may include
an Amplitude Scale Factor, an Angle Control Parameter, a Decorrelation Scale Factor,
and a Transient Flag, as stated above in connection with the description of a basic
encoder. The Amplitude Scale Factor is applied to Adjust Amplitude 26. The Transient
Flag and Decorrelation Scale Factor are applied to a Controllable Decorrelator 38 that
generates a Pseudo-Random Angle Control Parameter in response thereto. The Angle
Control Parameter and the Pseudo-Random Angle Control Parameter are summed
together by an additive combiner or combining function 40 in order to provide a control
signal for Rotate Angle 28.

Similarly, recovered sidechain information for the second channel, channel n, may
also include an Amplitude Scale Factor, an Angle Control Parameter, a Decorrelation
Scale Factor, and a Transient Flag, as described above in connection with the description
of a basic encoder. The Amplitude Scale Factor is applied to Adjust Amplitude 32. The
Transient Flag and Decorrelation Scale Factor are applied to a controllable decorrelator or
decorrelator function ("Controllable Decorrelator") 42 that generates a Pseudo-Random
Angle Control Parameter in response thereto. The Angle Control Parameter and the
Pseudo-Random Angle Control Parameter are summed together by an additive combiner
or combining function 44 in order to provide a control signal for Rotate Angle 34.
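
The additive combiner in each path may be sketched as a sum of the two angle
parameters. In the sketch below the Pseudo-Random Angle Control Parameter is simply
taken as an input, since how the Controllable Decorrelator derives it is not described at
this point; wrapping the sum into the range from minus pi to pi, the use of radians, and
the function names are assumptions.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def rotate_angle_control(angle_control, pseudo_random_angle):
    """Additive combiner: control signal supplied to a Rotate Angle."""
    return wrap_angle(angle_control + pseudo_random_angle)


# Angle Control Parameter of 3.0 rad plus a pseudo-random offset of 1.0 rad.
control = rotate_angle_control(3.0, 1.0)

# With no pseudo-random contribution the control is the Angle Control Parameter.
undo_only = rotate_angle_control(0.25, 0.0)
```

Wrapping does not change the resulting rotation, since angles that differ by a full turn
rotate a bin identically.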

Although a process or topology as just described is useful for understanding,
essentially the same results may be obtained with alternative processes or topologies that
achieve the same or similar results. For example, the order of Adjust Amplitude 26 (32)
and Rotate Angle 28 (34) may be reversed and/or there may be more than one Rotate
Angle, one that responds to the Angle Control Parameter and another that responds to
the Pseudo-Random Angle Control Parameter. The Rotate Angle may also be considered
to be three rather than one or two functions or devices, as in the example described below.

If a reference channel is employed, as discussed above in connection with the
basic encoder, the Rotate Angle, Controllable Decorrelator and Additive Combiner for
that channel may be omitted inasmuch as the sidechain information for the reference
channel may include only the Amplitude Scale Factor (or, alternatively, if the sidechain
information does not contain an Amplitude Scale Factor for the reference channel, it may
be deduced from Amplitude Scale Factors of the other channels when the energy
normalization in the encoder assures that the scale factors across channels within a
subband sum square to 1). An Amplitude Adjust is provided for the reference channel
and it is controlled by a received or derived Amplitude Scale Factor for the reference
channel. Whether the reference channel's Amplitude Scale Factor is derived from the
sidechain or is deduced in the decoder, the recovered reference channel is an amplitude-
scaled version of the mono composite channel. It does not require angle rotation because
it is the reference for the other channels' rotations.
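
The deduction of a missing reference-channel scale factor follows directly from the "sum square to 1" normalization. The sketch below assumes the scale factors have already been converted to linear gains; the function name is hypothetical, and clamping at zero is an added safeguard against rounding, not something stated in the text.

```python
import math


def deduce_reference_gain(other_gains):
    """Linear gain of the reference channel when squared gains across channels sum to 1."""
    remainder = 1.0 - sum(g * g for g in other_gains)
    # Quantization of the other gains may push the remainder slightly below zero.
    return math.sqrt(max(0.0, remainder))
```

For example, if the only other channel in a subband has a gain of 0.6, the reference channel's gain works out to about 0.8.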

Although adjusting the relative amplitude of recovered channels may provide a
modest degree of decorrelation, if used alone amplitude adjustment is likely to result in a
reproduced soundfield substantially lacking in spatialization or imaging for many signal
conditions (e.g., a "collapsed" soundfield). Amplitude adjustment may affect interaural
level differences at the ear, which is only one of the psychoacoustic directional cues
employed by the ear. Thus, according to aspects of the invention, certain angle-adjusting
techniques may be employed, depending on signal conditions, to provide additional
decorrelation. Reference may be made to Table 1 that provides abbreviated comments
useful in understanding angle-adjusting decorrelation techniques that may be employed in
accordance with aspects of the invention. Other decorrelation techniques as described
below in connection with the examples of FIGS. 8 and 9 may be employed instead of or
in addition to the techniques of Table 1.

Table 1

|  | Technique 1 | Technique 2 | Technique 3 |
|---|---|---|---|
| Type of Signal (typical example) | Spectrally static source | Complex continuous signals | Complex impulsive signals (transients) |
| Effect on Decorrelation | Decorrelates low frequency and steady-state signal components | Decorrelates non-impulsive complex signal components | Decorrelates impulsive high frequency signal components |
| Effect of transient present in frame | Operates with shortened time constant | Does not operate | Operates |
| What is done | Slowly shifts (frame-by-frame) bin angle in a channel | Adds to the angle shift of Technique 1 a pseudo-random angle shift on a bin-by-bin basis in a channel | Adds to the angle shift of Technique 1 a rapidly-changing (block by block) pseudo-random angle shift on a subband-by-subband basis in a channel |
| Controlled by or Scaled by | Degree of basic shift is controlled by Angle Control Parameter | Degree of additional shift is scaled directly by Decorrelation SF; same scaling across subband, scaling updated every frame | Degree of additional shift is scaled indirectly by Decorrelation SF; same scaling across subband, scaling updated every frame |
| Frequency Resolution of angle shift | Subband (same or interpolated shift value applied to all bins in each subband) | Bin (different pseudo-random shift value applied to each bin) | Subband (same pseudo-random shift value applied to all bins in each subband; different pseudo-random shift value applied to each subband in channel) |
| Time Resolution | Frame (shift values updated every frame) | Pseudo-random shift values remain the same and do not change | Block (pseudo-random shift values updated every block) |




For signals that are substantially static spectrally, such as, for example, a pitch
pipe note, a first technique ("Technique 1") restores the angle of the received mono
composite signal relative to the angle of each of the other recovered channels to an angle
similar (subject to frequency and time granularity and to quantization) to the original
angle of the channel relative to the other channels at the input of the encoder. Phase angle
differences are useful, particularly, for providing decorrelation of low-frequency signal
components below about 1500 Hz where the ear follows individual cycles of the audio
signal. Preferably, Technique 1 operates under all signal conditions to provide a basic
angle shift.

For high-frequency signal components above about 1500 Hz, the ear does not
follow individual cycles of sound but instead responds to waveform envelopes (on a
critical band basis). Hence, above about 1500 Hz decorrelation is better provided by
differences in signal envelopes rather than phase angle differences. Applying phase angle
shifts only in accordance with Technique 1 does not alter the envelopes of signals
sufficiently to decorrelate high frequency signals. The second and third techniques
("Technique 2" and "Technique 3", respectively) add a controllable amount of pseudo-
random angle variations to the angle determined by Technique 1 under certain signal
conditions, thereby causing a controllable amount of pseudo-random envelope variations,
which enhances decorrelation. Preferably, a controllable degree of Technique 2 or
Technique 3 operates along with Technique 1 under certain signal conditions.

Technique 2 is suitable for complex continuous signals that are rich in harmonics,
such as massed orchestral violins. Technique 3 is suitable for complex impulsive or
transient signals, such as applause, castanets, etc. (Technique 2 time smears claps in
applause, making it unsuitable for such signals). As explained further below, in order to
minimize audible artifacts, Technique 2 and Technique 3 have different time and
frequency resolutions for applying pseudo-random angle variations: Technique 2 is
selected when a transient is not present, whereas Technique 3 is selected when a transient
is present.

Technique 1 slowly shifts (frame by frame) the bin angle in a channel. The degree
of this basic shift is controlled by the Angle Control Parameter (no shift if the parameter
is zero). As explained further below, either the same or an interpolated parameter is
applied to all bins in each subband and the parameter is updated every frame.
Consequently, each subband of each channel may have a phase shift with respect to other
channels, providing a degree of decorrelation at low frequencies (below about 1500 Hz).
However, Technique 1, by itself, is unsuitable for a transient signal such as applause. For
such signal conditions, the reproduced channels may exhibit an annoying unstable comb-
filter effect. In the case of applause, essentially no decorrelation is provided by adjusting
the relative amplitude of recovered channels because all channels tend to have the same
amplitude over the period of a frame.

Technique 2 operates when a transient is not present. Technique 2 adds to the
angle shift of Technique 1 a pseudo-random angle shift that does not change with time,
on a bin-by-bin basis (each bin has a different pseudo-random shift) in a channel, causing
the envelopes of the channels to be different from one another, thus providing
decorrelation of complex signals among the channels. Maintaining the pseudo-random
phase angle values constant over time avoids block or frame artifacts that may result from
block-to-block or frame-to-frame alteration of bin phase angles. While this technique is a
very useful decorrelation tool when a transient is not present, it may temporally smear a
transient (resulting in what is often referred to as "pre-noise"; the post-transient
smearing is masked by the transient). The degree of additional shift provided by
Technique 2 is scaled directly by the Decorrelation Scale Factor (there is no additional
shift if the scale factor is zero). Ideally, the amount of pseudo-random phase angle added
to the base angle shift (of Technique 1) according to Technique 2 is controlled by the
Decorrelation Scale Factor in a manner that avoids audible signal warbling artifacts.
Although a different additional pseudo-random angle shift value is applied to each bin
and that shift value does not change, the same scaling is applied across a subband and the
scaling is updated every frame.
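
Technique 2 as described above can be sketched as a fixed per-bin table of pseudo-random angles, scaled by the Decorrelation Scale Factor of the subband each bin belongs to. The function names, the seeded generator, the angle range, and the `subband_of_bin` mapping are assumptions for illustration; the text specifies only that the per-bin values are time-invariant and that the scaling is per subband.

```python
import math
import random


def make_angle_table(num_bins, seed=0):
    """Fixed lookup table of per-bin pseudo-random angles between -pi and +pi."""
    rng = random.Random(seed)  # same seed gives the same table every time
    return [rng.uniform(-math.pi, math.pi) for _ in range(num_bins)]


def technique2_angles(base_angles, angle_table, subband_of_bin, decorr_scale):
    """Add a time-invariant per-bin shift, scaled per subband, to the Technique 1 angles.

    base_angles    -- per-bin angle shift from Technique 1 (radians)
    angle_table    -- fixed per-bin pseudo-random angles
    subband_of_bin -- index of the subband containing each bin
    decorr_scale   -- Decorrelation Scale Factor (0..1) per subband for this frame
    """
    return [
        base + decorr_scale[subband_of_bin[k]] * angle_table[k]
        for k, base in enumerate(base_angles)
    ]
```

Only `decorr_scale` changes from frame to frame; the table itself never does, which is what keeps the added shift free of block-rate or frame-rate discontinuities.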

Technique 3 operates in the presence of a transient. It shifts all the bins in each
subband in a channel from block to block with a unique pseudo-random angle value,
common to all bins in the subband, causing not only the envelopes, but also the
amplitudes and phases, of the signals in a channel to change with respect to other
channels from block to block. This reduces steady-state signal similarities among the
channels and provides decorrelation of the channels substantially without causing "pre-
noise" artifacts. Although the ear does not respond to pure angle changes directly at high
frequencies, when two or more channels mix acoustically on their way from loudspeakers
to a listener, phase differences may cause amplitude changes (comb-filter effects) that
may be audible and objectionable, and these are broken up by Technique 3. The
impulsive characteristics of the signal minimize block-rate artifacts that might otherwise
occur. Thus, Technique 3 adds to the phase shift of Technique 1 a rapidly changing
(block by block) pseudo-random angle shift on a subband-by-subband basis in a channel.
The degree of additional shift is scaled indirectly, as described below, by the
Decorrelation Scale Factor (there is no additional shift if the scale factor is zero). The
same scaling is applied across a subband and the scaling is updated every frame.
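
A rough sketch of Technique 3 follows: one fresh pseudo-random angle per subband is drawn for every block and shared by all bins of that subband. The text states that the scaling by the Decorrelation Scale Factor is indirect; the direct multiplication used here is a simplifying assumption, as are the function name, the angle range, and the `subband_of_bin` mapping.

```python
import math
import random


def technique3_angles(base_angles, subband_of_bin, decorr_scale, rng):
    """Add one fresh pseudo-random shift per subband for the current block.

    Call once per block with the same rng so the shift changes block by block.
    decorr_scale holds one scale factor per subband (updated once per frame).
    """
    per_subband = [rng.uniform(-math.pi, math.pi) for _ in decorr_scale]
    return [
        base + decorr_scale[subband_of_bin[k]] * per_subband[subband_of_bin[k]]
        for k, base in enumerate(base_angles)
    ]
```

Calling the function for successive blocks with the same `random.Random` instance yields a different shift each block while keeping all bins of a subband aligned with one another.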

Although the angle-adjusting techniques have been characterized as three
techniques, this is a matter of semantics and they may also be characterized as two
techniques: (1) a combination of Technique 1 and a variable degree of Technique 2,
which may be zero, and (2) a combination of Technique 1 and a variable degree of
Technique 3, which may be zero. For convenience in presentation, the techniques are
treated as being three techniques.


As mentioned above, the sidechain information may include: an Amplitude Scale
Factor, an Angle Control Parameter, a Decorrelation Scale Factor, and a Transient Flag.
Such sidechain information for a practical embodiment of aspects of the present invention
may be summarized in the following Table 2. Typically, the sidechain information may
be updated once per frame.

Table 2

Sidechain Information Characteristics for a Channel

| Sidechain Parameter | Value Range | Represents (is "a measure of") | Quantization Levels | Primary Purpose |
|---|---|---|---|---|
| Subband Angle Control Parameter | 0 to +2π | Smoothed time average across subband of difference between angle of each bin in subband for a channel and that of the corresponding bin of a reference channel | 6 bit (64 levels) | Provides basic angle rotation for each bin in channel |
| Subband Decorrelation Scale Factor | 0 to 1. The Subband Decorrelation Scale Factor is high only if both the Spectral-Steadiness Factor and the Interchannel Angle Consistency Factor are low. | Spectral-steadiness of signal characteristics over time in a subband of a channel (the Spectral-Steadiness Factor) and the consistency in the same subband of a channel of bin angles with respect to corresponding bins of a reference channel (the Interchannel Angle Consistency Factor) | 3 bit (8 levels) | Scales pseudo-random angle shifts added to basic angle rotation |
| Subband Amplitude Scale Factor | 0 to 31 (whole integer); 0 is highest amplitude, 31 is lowest amplitude | Energy or amplitude in subband of a channel with respect to energy or amplitude for same subband across all channels | 5 bit (32 levels). Granularity is 1.5 dB, so the range is 31 * 1.5 = 46.5 dB plus final value = off. | Scales amplitude of bins in a subband in a channel |
| Transient Flag | True/False (polarity is arbitrary) | Presence of a transient in the frame | 1 bit (2 levels) | Determines which technique for adding pseudo-random angle shifts is employed |



In each case, the sidechain information of a channel applies to a single subband
(except for the Transient Flag, which applies to all subbands) and may be updated once
per frame. Although the time resolution (once per frame), frequency resolution
(subband), value ranges and quantization levels indicated have been found to provide
useful performance and a useful compromise between a low bit rate and performance, it
will be appreciated that these time and frequency resolutions, value ranges and
quantization levels are not critical and that other resolutions, ranges and levels may be
employed in practicing aspects of the invention.
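
The bit widths in Table 2 can be made concrete with simple uniform quantizers. The sketch below is only one plausible reading: the uniform step sizes, the rounding rule, and all function names are assumptions, and the special "off" value mentioned for the Amplitude Scale Factor is deliberately not modeled because the table does not say how it is encoded.

```python
import math


def quantize_angle(angle, bits=6):
    """Uniformly quantize an angle (radians) to 2**bits levels covering 0 to 2*pi."""
    levels = 1 << bits
    wrapped = angle % (2.0 * math.pi)
    # The final modulo folds a value that rounds up to 2*pi back onto index 0.
    return int(round(wrapped / (2.0 * math.pi) * levels)) % levels


def dequantize_angle(index, bits=6):
    """Map a quantizer index back to an angle in radians."""
    return index * 2.0 * math.pi / (1 << bits)


def quantize_decorrelation(value, bits=3):
    """Uniformly quantize a 0..1 scale factor to 2**bits levels."""
    top = (1 << bits) - 1
    return int(round(min(max(value, 0.0), 1.0) * top))


def amplitude_index_to_gain(index, step_db=1.5):
    """Index 0 is the highest amplitude; each step lowers the level by step_db."""
    return 10.0 ** (-step_db * index / 20.0)
```

With these assumptions, index 31 of the amplitude scale corresponds to an attenuation of 31 * 1.5 = 46.5 dB, matching the range stated in Table 2.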

It will be noted that Technique 2, described above (see also Table 1), provides a
bin frequency resolution rather than a subband frequency resolution (i.e., a different
pseudo-random phase angle shift is applied to each bin rather than to each subband) even
though the same Subband Decorrelation Scale Factor applies to all bins in a subband. It
will also be noted that Technique 3, described above (see also Table 1), provides a block
time resolution rather than a frame time resolution (i.e., a different pseudo-random phase
angle shift is applied to each block rather than to each frame) even though the same
Subband Decorrelation Scale Factor applies to all bins in a subband. Such resolutions,
greater than the resolution of the sidechain information, are possible because the
pseudo-random phase angle shifts may be generated in a decoder and need not be known in
the encoder (this is the case even if the encoder also applies a pseudo-random phase angle
shift to the encoded mono composite signal, an alternative that is described below). In
other words, it is not necessary to send sidechain information having bin or block
granularity even though the decorrelation techniques employ such granularity. The
decoder may employ, for example, one or more lookup tables of pseudo-randomly-chosen
bin phase angles. The obtaining of time and/or frequency resolutions for decorrelation
greater than the sidechain information rates is among the aspects of the present invention.
Thus, decorrelation by way of randomized phases is performed either with a fine frequency
resolution (bin-by-bin) that does not change with time (Technique 2), or with a coarse
frequency resolution (band-by-band) and a fine time resolution (block rate) (Technique 3).

It will also be appreciated that as increasing degrees of pseudo-random phase
shifts are added to the phase angle of a recovered channel, the absolute phase angle
of the recovered channel differs more and more from the original absolute phase angle of
that channel. An aspect of the present invention is the appreciation that the resulting
absolute phase angle of the recovered channel need not match that of the original channel
when signal conditions are such that the pseudo-random phase shifts are added in
accordance with aspects of the present invention. For example, in extreme cases when
the Decorrelation Scale Factor causes the highest degree of pseudo-random phase shift,
the phase shift caused by Technique 2 or Technique 3 overwhelms the basic phase shift
caused by Technique 1. Nevertheless, this is of no concern in that a pseudo-random
phase shift is audibly the same as the different random phases in the original signal that
give rise to a Decorrelation Scale Factor that causes the addition of some degree of
pseudo-random phase shifts.

Inasmuch as the Transient Flag applies to a frame, the time resolution with which
the Transient Flag selects Technique 2 or Technique 3 may be enhanced by providing a
supplemental transient detector in the decoder in order to provide a resolution finer than
the frame rate or even the block rate. Such a supplemental transient detector may detect
the occurrence of a transient in the mono composite audio signal received by the decoder
and such detection information is then sent to each controllable decorrelator (as 38, 42 of
FIG. 2). Then, upon the receipt of a Transient Flag for its channel, the controllable
decorrelator switches from Technique 2 to Technique 3 upon receipt of the decoder's
local transient detection indication. Thus, a substantial improvement in resolution is
possible without increasing the sidechain bit rate.

As an alternative to sending sidechain information on a frame-by-frame basis,
sidechain information may be updated every block, at least for highly dynamic signals. In
order to accomplish that without substantially increasing the sidechain data rate, a block-
floating-point differential coding arrangement may be used. For example, consecutive
transform blocks may be collected in groups of six over a frame. The full sidechain
information may be sent for each subband-channel in the first block. In the five
subsequent blocks, only differential values may be sent, each the difference between the
current-block amplitude and angle, and the equivalent values from the previous block.
This results in very low data rate for static signals, such as a pitch pipe note. For more
dynamic signals, a greater range of difference values is required, but at less precision. So,
for each group of five differential values, an exponent may be sent first, using, for
example, 3 bits, then differential values are quantized to, for example, 2-bit accuracy.
This arrangement reduces the average worst-case sidechain data rate by about a factor of
two. Further reduction may be obtained by omitting the sidechain data for a reference
channel (since it can be derived from the other channels), as discussed above, and by
using, for example, arithmetic coding. Alternatively or in addition, differential coding
across frequency may be employed by sending, for example, differences in subband angle
or amplitude.
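
The block-floating-point differential arrangement can be illustrated with the sketch below. The text gives only the bit widths (a 3-bit exponent and 2-bit differences for a group of five), so everything else here is an assumption: the values are taken to be integer quantizer indices, the shared exponent is a power-of-two step size, each 2-bit difference is a signed value from -2 to 1, and the encoder tracks the decoder's reconstruction so that rounding errors do not accumulate.

```python
def encode_group(values, exp_bits=3, mant_bits=2):
    """Encode one full value followed by block-to-block differences.

    Returns (first_value, exponent, mantissas). The smallest exponent that
    avoids clipping is chosen; if none does, the largest exponent is used.
    """
    lo, hi = -(1 << (mant_bits - 1)), (1 << (mant_bits - 1)) - 1
    first, rest = values[0], values[1:]
    for exponent in range(1 << exp_bits):
        step = 1 << exponent
        recon, mantissas, clipped = first, [], False
        for v in rest:
            m = round((v - recon) / step)
            if m < lo or m > hi:
                clipped = True
                m = min(hi, max(lo, m))
            mantissas.append(m)
            recon += m * step  # what the decoder will reconstruct
        if not clipped:
            break
    return first, exponent, mantissas


def decode_group(first, exponent, mantissas):
    """Rebuild the six per-block values from the group code."""
    values, step = [first], 1 << exponent
    for m in mantissas:
        values.append(values[-1] + m * step)
    return values
```

Under these assumptions the five differences cost 3 + 5 * 2 = 13 bits per parameter, and a static signal encodes as exponent 0 with all-zero differences.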

Whether sidechain information is sent on a frame-by-frame basis or more
frequently, it may be useful to interpolate sidechain values across the blocks in a frame.
Linear interpolation over time may be employed in the manner of the linear interpolation
across frequency, as described below.

One suitable implementation of aspects of the present invention employs
processing steps or devices that implement the respective processing steps and are
functionally related as next set forth. Although the encoding and decoding steps listed
below may each be carried out by computer software instruction sequences operating in
the order of the below listed steps, it will be understood that equivalent or similar results
may be obtained by steps ordered in other ways, taking into account that certain quantities
are derived from earlier ones. For example, multi-threaded computer software instruction
sequences may be employed so that certain sequences of steps are carried out in parallel.
Alternatively, the described steps may be implemented as devices that perform the
described functions, the various devices having functional interrelationships as described
hereinafter.

Encoding

The encoder or encoding function may collect a frame's worth of data before it
derives sidechain information and downmixes the frame's audio channels to a single
monophonic (mono) audio channel. By doing so, sidechain information may be sent first
to a decoder, allowing the decoder to begin decoding immediately upon receipt of the
mono audio channel information. Steps of an encoding process ("encoding steps") may
be described as follows. With respect to encoding steps, reference is made to FIG. 4,
which is in the nature of a hybrid flowchart and functional block diagram. Through Step
419, FIG. 4 shows encoding steps for one channel. Steps 420 and 421 apply to all of the
multiple channels that are combined to provide a composite mono signal output.

Step 401. Detect Transients.

a. Perform transient detection of the PCM values in an input audio channel.
b. Set a one-bit Transient Flag True if a transient is present in any block of a frame
for the channel.

Comments regarding Step 401:

The Transient Flag forms a portion of the sidechain information and is also used
in Step 411, as described below. Although a block-rate rather than a frame-rate Transient
Flag may form a portion of the sidechain information with a modest increase in bit rate,
increasing transient information resolution to a block rate is not believed to noticeably
improve decoder performance. However, as mentioned above, transient resolution finer
than block rate in the decoder may improve decoder performance and this may be
accomplished without increasing the sidechain bit rate by detecting the occurrence of
transients in the mono composite signal received in the decoder.

There is one transient flag per channel per frame, which, because it is derived in
the time domain, necessarily applies to all subbands within that channel. The transient
detection may be performed in the manner similar to that employed in an AC-3 encoder
for controlling the decision of when to switch between long and short length audio
blocks, but with a higher sensitivity and with the Transient Flag True for any frame in
which the Transient Flag for a block is True (the AC-3 encoder detects transients on a
block basis). In particular, see Section 8.2.2 of the above-cited A/52A document. The
sensitivity of the transient detection described in Section 8.2.2 may be increased by
adding a sensitivity factor F to an equation set forth therein. Section 8.2.2 of the A/52A
document is set forth below, with the sensitivity factor added (Section 8.2.2 as reproduced
below is corrected to indicate that the low pass filter is a cascaded biquad direct form II
IIR filter rather than "form I" as in the published A/52A document; Section 8.2.2 was
correct in the earlier A/52 document). Although it is not critical, a sensitivity factor of
0.2 has been found to be a suitable value in a practical embodiment of aspects of the
present invention.

Alternatively, a similar transient detection technique described in U.S. Patent
5,394,473 may be employed. The '473 patent describes aspects of the A/52A document
transient detector in greater detail. Both said A/52A document and said '473 patent are
hereby incorporated by reference in their entirety.

As another alternative, transients may be detected in the frequency domain rather
than in the time domain. In that case, Step 401 may be omitted and an alternative step
employed in the frequency domain as described below.

Step 402. Window and DFT.

Window PCM values and convert them to complex frequency values via a DFT as
implemented by an FFT.

Step 403. Convert Complex Values to Magnitude and Angle.

Convert each frequency-domain complex transform bin value (a + jb) to a
magnitude and angle representation using standard complex manipulations:

a. Magnitude = square_root (a^2 + b^2)

b. Angle = arctan (b/a)

Comments regarding Step 403:

Some of the following Steps use or may use, as an alternative, the energy of a bin,
defined as the above magnitude squared (i.e., energy = (a^2 + b^2)).
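
Step 403 can be written in a few lines. The sketch uses the four-quadrant arctangent (`math.atan2`) rather than a plain arctan (b/a) so that the angle stays correct when a is zero or negative; that choice, and the function names, are assumptions of this example rather than something the text prescribes.

```python
import math


def to_polar(bin_value):
    """Convert a complex bin value a + jb to (magnitude, angle in radians)."""
    a, b = bin_value.real, bin_value.imag
    magnitude = math.sqrt(a * a + b * b)
    angle = math.atan2(b, a)  # four-quadrant form of arctan(b / a)
    return magnitude, angle


def bin_energy(bin_value):
    """Energy of a bin: the magnitude squared, a^2 + b^2."""
    return bin_value.real ** 2 + bin_value.imag ** 2
```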
Step 404. Calculate Subband Energy.

a. Calculate the subband energy per block by adding bin energy values within
each subband (a summation across frequency).

b. Calculate the subband energy per frame by averaging or accumulating the
energy in all the blocks in a frame (an averaging / accumulation across time).

c. If the coupling frequency of the encoder is below about 1000 Hz, apply the
subband frame-averaged or frame-accumulated energy to a time smoother that operates
on all subbands below that frequency and above the coupling frequency.
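
Steps 404a and 404b amount to a sum across frequency followed by an average across time. In the sketch below, the representation of subbands as a list of bin-index boundaries (`subband_edges`) and the choice of averaging rather than accumulating are assumptions made for illustration; the time smoother of Step 404c is not included.

```python
def subband_energy_per_block(bins, subband_edges):
    """Step 404a: sum bin energies within each subband of one block.

    subband_edges[i] is the first bin of subband i; the last entry is one past
    the final bin, so there are len(subband_edges) - 1 subbands.
    """
    return [
        sum(abs(x) ** 2 for x in bins[start:stop])
        for start, stop in zip(subband_edges[:-1], subband_edges[1:])
    ]


def subband_energy_per_frame(blocks, subband_edges):
    """Step 404b: average the per-block subband energies over a frame."""
    per_block = [subband_energy_per_block(b, subband_edges) for b in blocks]
    return [sum(column) / len(per_block) for column in zip(*per_block)]
```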

Comments regarding Step 404c:

Time smoothing to provide inter-frame smoothing in low frequency subbands may
be useful. In order to avoid artifact-causing discontinuities between bin values at subband
boundaries, it may be useful to apply a progressively-decreasing time smoothing from the
lowest frequency subband encompassing and above the coupling frequency (where the
smoothing may have a significant effect) up through a higher frequency subband in which
the time smoothing effect is measurable, but inaudible, although nearly audible. A
suitable time constant for the lowest frequency range subband (where the subband is a
single bin if subbands are critical bands) may be in the range of 50 to 100 milliseconds,
for example. Progressively-decreasing time smoothing may continue up through a
subband encompassing about 1000 Hz where the time constant may be about 10
milliseconds, for example.

Although a first-order smoother is suitable, the smoother may be a two-stage
smoother that has a variable time constant that shortens its attack and decay time in
response to a transient (such a two-stage smoother may be a digital equivalent of the
analog two-stage smoothers described in U.S. Patents 3,846,719 and 4,922,535, each of
which is hereby incorporated by reference in its entirety). In other words, the steady-state
time constant may be scaled according to frequency and may also be variable in response
to transients. Alternatively, such smoothing may be applied in Step 412.
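
A first-order smoother of the kind mentioned above can be sketched as a one-pole recursive filter whose coefficient is derived from the desired time constant. The update interval (for instance, the frame duration) is not given in this passage, so it is a parameter here; the function names and the initialization of the state to the first input value are likewise assumptions. The two-stage, transient-responsive smoother is not modeled.

```python
import math


def smoothing_coefficient(time_constant_s, update_interval_s):
    """One-pole coefficient for a given time constant and update interval (seconds)."""
    return math.exp(-update_interval_s / time_constant_s)


def smooth(values, time_constant_s, update_interval_s):
    """Apply a first-order recursive smoother to a sequence of per-frame values."""
    alpha = smoothing_coefficient(time_constant_s, update_interval_s)
    state = values[0]
    smoothed = []
    for v in values:
        state = alpha * state + (1.0 - alpha) * v
        smoothed.append(state)
    return smoothed
```

A progressively-decreasing smoothing across subbands would then correspond to calling `smooth` with a time constant near 0.05 to 0.1 s for the lowest subband and about 0.01 s for the subband near 1000 Hz.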
Step 405. Calculate Sum of Bin Magnitudes.

a. Calculate the sum per block of the bin magnitudes (Step 403) of each subband
(a summation across frequency).

b. Calculate the sum per frame of the bin magnitudes of each subband by
averaging or accumulating the magnitudes of Step 405a across the blocks in a frame (an
averaging / accumulation across time). These sums are used to calculate an Interchannel
Angle Consistency Factor in Step 410 below.

c. If the coupling frequency of the encoder is below about 1000 Hz, apply the
subband frame-averaged or frame-accumulated magnitudes to a time smoother that
operates on all subbands below that frequency and above the coupling frequency.

Comments regarding Step 405c: See comments regarding Step 404c except that
in the case of Step 405c, the time smoothing may alternatively be performed as part of
Step 410.

Step 406. Calculate Relative Interchannel Bin Phase Angle.

Calculate the relative interchannel phase angle of each transform bin of each block
by subtracting from the bin angle of Step 403 the corresponding bin angle of a reference
channel (for example, the first channel). The result, as with other angle additions or
subtractions herein, is taken modulo (π, -π) radians by adding or subtracting 2π until the
result is within the desired range of -π to +π.
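
The wrap-around described in Step 406 can be sketched directly as repeated addition or subtraction of 2π. Which endpoint of the range is included is not stated in the text; including +π and excluding -π is an assumption of this example, as are the function names.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians into the range -pi (exclusive) to +pi (inclusive)."""
    while angle > math.pi:
        angle -= 2.0 * math.pi
    while angle <= -math.pi:
        angle += 2.0 * math.pi
    return angle


def relative_bin_angle(channel_angle, reference_angle):
    """Step 406: angle of a channel's bin relative to the reference channel's bin."""
    return wrap_angle(channel_angle - reference_angle)
```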

Step 407. Calculate Interchannel Subband Phase Angle.

For each channel, calculate a frame-rate amplitude-weighted average interchannel
phase angle for each subband as follows:

a. For each bin, construct a complex number from the magnitude of Step 403
and the relative interchannel bin phase angle of Step 406.

b. Add the constructed complex numbers of Step 407a across each subband (a
summation across frequency).

Comment regarding Step 407b: For example, if a subband has two bins and
one of the bins has a complex value of 1 + j1 and the other bin has a complex
value of 2 + j2, their complex sum is 3 + j3.

c. Average or accumulate the per block complex number sum for each
subband of Step 407b across the blocks of each frame (an averaging or
accumulation across time).

d. If the coupling frequency of the encoder is below about 1000 Hz, apply the
subband frame-averaged or frame-accumulated complex value to a time smoother
that operates on all subbands below that frequency and above the coupling
frequency.

Comments regarding Step 407d: See comments regarding Step 404c except
that in the case of Step 407d, the time smoothing may alternatively be performed
as part of Steps 407e or 410.

e. Compute the magnitude of the complex result of Step 407d as per Step 403.

Comment regarding Step 407e: This magnitude is used in Step 410a below.
In the simple example given in Step 407b, the magnitude of 3 + j3 is square_root
(9 + 9) = 4.24.

f. Compute the angle of the complex result as per Step 403.

Comments regarding Step 407f: In the simple example given in Step 407b,
the angle of 3 + j3 is arctan (3/3) = 45 degrees = π/4 radians. This subband angle
is signal-dependently time-smoothed (see Step 413) and quantized (see Step 414)
to generate the Subband Angle Control Parameter sidechain information, as
described below.
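
The amplitude weighting in Step 407 comes for free from summing complex numbers: bins with larger magnitudes pull the resulting angle toward their own. The sketch below covers Steps 407a, 407b, 407e and 407f for a single block of a single subband; averaging across blocks (407c) and time smoothing (407d) are omitted, and the function name is hypothetical.

```python
import cmath


def subband_phase(magnitudes, relative_angles):
    """Return (magnitude, angle) of the complex sum of a subband's bins.

    magnitudes      -- per-bin magnitudes from Step 403
    relative_angles -- per-bin relative interchannel angles from Step 406 (radians)
    """
    total = sum(cmath.rect(m, a) for m, a in zip(magnitudes, relative_angles))
    return abs(total), cmath.phase(total)
```

Feeding in the two bins of the worked example (1 + j1 and 2 + j2, i.e. magnitudes of about 1.41 and 2.83 at an angle of π/4) reproduces the sum 3 + j3, with magnitude of about 4.24 and angle π/4.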

Step 408. Calculate Bin Spectral-Steadiness Factor.

For each bin, calculate a Bin Spectral-Steadiness Factor in the range of 0 to 1 as
follows:

a. Let x_m = bin magnitude of present block calculated in Step 403.
b. Let y_m = corresponding bin magnitude of previous block.
c. If x_m > y_m, then Bin Dynamic Amplitude Factor = (y_m / x_m)^2;
d. Else if y_m > x_m, then Bin Dynamic Amplitude Factor = (x_m / y_m)^2,
e. Else if y_m = x_m, then Bin Spectral-Steadiness Factor = 1.

Comment regarding Step 408:

"Spectral steadiness" is a measure of the extent to which spectral components
(e.g., spectral coefficients or bin values) change over time. A Bin Spectral-Steadiness
Factor of 1 indicates no change over a given time period.

Alternatively, Step 408 may look at three consecutive blocks. If the coupling
frequency of the encoder is below about 1000 Hz, Step 408 may look at more than three
consecutive blocks. The number of consecutive blocks taken into consideration may vary
with frequency such that the number gradually increases as the subband frequency range
decreases.

As a further alternative, bin energies may be used instead of bin magnitudes.

As yet a further alternative, Step 408 may employ an "event decision" detecting
technique as described below in the comments following Step 409.
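
The two-block form of Step 408 reduces to the squared ratio of the smaller magnitude to the larger one. The sketch below treats the "Bin Dynamic Amplitude Factor" of Steps 408c and 408d as the same quantity as the Bin Spectral-Steadiness Factor, and assumes an exponent of 2 and non-negative magnitudes; the function name is hypothetical.

```python
def bin_spectral_steadiness(present, previous):
    """Squared ratio of the smaller to the larger magnitude; 1.0 when they are equal.

    present  -- bin magnitude in the current block (x_m)
    previous -- magnitude of the same bin in the previous block (y_m)
    """
    if present == previous:
        return 1.0  # also covers the case where both magnitudes are zero
    smaller, larger = sorted((present, previous))
    return (smaller / larger) ** 2
```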
Step 409. Compute Subband Spectral-Steadiness Factor.

Compute a frame-rate Subband Spectral-Steadiness Factor on a scale of 0 to 1 by
forming an amplitude-weighted average of the Bin Spectral-Steadiness Factor within each
subband across the blocks in a frame as follows:

a. For each bin, calculate the product of the Bin Spectral-Steadiness Factor of Step
408 and the bin magnitude of Step 403.

b. Sum the products within each subband (a summation across frequency).

c. Average or accumulate the summation of Step 409b in all the blocks in a frame
(an averaging / accumulation across time).

d. If the coupling frequency of the encoder is below about 1000 Hz, apply the
subband frame-averaged or frame-accumulated summation to a time smoother that
operates on all subbands below that frequency and above the coupling frequency.

Comments regarding Step 409d: See comments regarding Step 404c except that
in the case of Step 409d, there is no suitable subsequent step in which the time
smoothing may alternatively be performed.

e. Divide the results of Step 409c or Step 409d, as appropriate, by the sum of the
bin magnitudes (Step 405) within the subband.

Comment regarding Step 409e: The multiplication by the magnitude in Step
409a and the division by the sum of the magnitudes in Step 409e provide amplitude
weighting. The output of Step 408 is independent of absolute amplitude and, if not
amplitude weighted, may cause the output of Step 409 to be controlled by very small
amplitudes, which is undesirable.

f. Scale the result to obtain the Subband Spectral-Steadiness Factor by mapping
the range from {0.5...1} to {0...1}. This may be done by multiplying the result by 2,
subtracting 1, and limiting results less than 0 to a value of 0.

Comment regarding Step 409f: Step 409f may be useful in assuring that a
channel of noise results in a Subband Spectral-Steadiness Factor of zero.
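The amplitude-weighted average and the final rescaling of Step 409 can be sketched as follows. This is an illustrative reading under stated assumptions: the function name and nested-list layout are hypothetical, the sums are accumulated (not averaged) over the blocks of a frame, the optional low-frequency time smoother is omitted, and a silent subband is arbitrarily assigned a factor of 0.

```python
def subband_steadiness(bin_factors, bin_mags):
    """Frame-rate subband steadiness for one subband of one channel.

    bin_factors[b][k]: Bin Spectral-Steadiness Factor for block b, bin k.
    bin_mags[b][k]:    bin magnitude for block b, bin k.
    """
    # Product per bin, summed across frequency and accumulated across blocks.
    weighted = sum(
        f * m
        for factors, mags in zip(bin_factors, bin_mags)
        for f, m in zip(factors, mags)
    )
    # Sum of bin magnitudes over the same bins and blocks.
    total = sum(m for mags in bin_mags for m in mags)
    if total == 0.0:
        return 0.0  # assumption: the text does not define the silent case
    raw = weighted / total  # amplitude-weighted average of the bin factors
    # Map 0.5..1 onto 0..1 and limit negative results to 0.
    return max(0.0, 2.0 * raw - 1.0)
```

With this weighting, a very quiet bin that changes a great deal barely moves the result, while a loud unsteady bin pulls it down strongly.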
Comments regarding Steps 408 and 409:

The goal of Steps 408 and 409 is to measure spectral steadiness, that is, changes in
spectral composition over time in a subband of a channel. Alternatively, aspects of an
"event decision" sensing such as described in International Publication Number WO
02/097792 A1 (designating the United States) may be employed to measure spectral
steadiness instead of the approach just described in connection with Steps 408 and 409.
U.S. Patent Application S.N. 10/478,538, filed November 20, 2003 is the United States'
national application of the published PCT Application WO 02/097792 A1. Both the
published PCT application and the U.S. application are hereby incorporated by reference
in their entirety. According to these incorporated applications, the magnitudes of the
complex FFT coefficient of each bin are calculated and normalized (largest magnitude is
set to a value of one, for example). Then the magnitudes of corresponding bins (in dB) in
consecutive blocks are subtracted (ignoring signs), the differences between bins are
summed, and, if the sum exceeds a threshold, the block boundary is considered to be an
auditory event boundary. Alternatively, changes in amplitude from block to block may
also be considered along with spectral magnitude changes (by looking at the amount of
normalization required).

If aspects of the incorporated event-sensing applications are employed to measure
spectral steadiness, normalization may not be required and the changes in spectral
magnitude (changes in amplitude would not be measured if normalization is omitted)
preferably are considered on a subband basis. Instead of performing Step 408 as
indicated above, the decibel differences in spectral magnitude between corresponding
bins in each subband may be summed in accordance with the teachings of said
applications. Then, each of those sums, representing the degree of spectral change from
block to block, may be scaled so that the result is a spectral steadiness factor having a
range from 0 to 1, wherein a value of 1 indicates the highest steadiness, a change of 0 dB
from block to block for a given bin. A value of 0, indicating the lowest steadiness, may
be assigned to decibel changes equal to or greater than a suitable amount, such as 12 dB,
for example. These results, a Bin Spectral-Steadiness Factor, may be used by Step 409 in
the same manner that Step 409 uses the results of Step 408 as described above. When
Step 409 receives a Bin Spectral-Steadiness Factor obtained by employing the just-
described alternative event decision sensing technique, the Subband Spectral-Steadiness
Factor of Step 409 may also be used as an indicator of a transient. For example, if the
range of values produced by Step 409 is 0 to 1, a transient may be considered to be
present when the Subband Spectral-Steadiness Factor is a small value, such as, for
example, 0.1, indicating substantial spectral unsteadiness.
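The scaling from a decibel change to a steadiness factor might look like the sketch below. The text fixes only the endpoints (0 dB maps to 1, and 12 dB or more maps to 0); the straight line between them, the function name, and the parameter name are assumptions made for illustration.

```python
def steadiness_from_db_change(db_change, max_db=12.0):
    """Map a block-to-block decibel change onto a 0..1 steadiness factor.

    0 dB of change gives 1 (highest steadiness); changes of max_db or more
    give 0 (lowest steadiness). Linear in between, which is an assumption.
    """
    return max(0.0, 1.0 - abs(db_change) / max_db)
```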

It will be appreciated that the Bin Spectral-Steadiness Factor produced by Step
408 and by the just-described alternative to Step 408 each inherently provide a variable
threshold to a certain degree in that they are based on relative changes from block to
block. Optionally, it may be useful to supplement such inherency by specifically
providing a shift in the threshold in response to, for example, multiple transients in a
frame or a large transient among smaller transients (e.g., a loud transient coming atop
mid- to low-level applause). In the case of the latter example, an event detector may
initially identify each clap as an event, but a loud transient (e.g., a drum hit) may make it
desirable to shift the threshold so that only the drum hit is identified as an event.

Alternatively, a randomness metric may be employed (for example, as described
in U.S. Patent Re 36,714, which is hereby incorporated by reference in its entirety)
instead of a measure of spectral-steadiness over time.

Step 410. Calculate Interchannel Angle Consistency Factor.

For each subband having more than one bin, calculate a frame-rate Interchannel
Angle Consistency Factor as follows:

a. Divide the magnitude of the complex sum of Step 407e by the sum of the
magnitudes of Step 405. The resulting "raw" Angle Consistency Factor is a
number in the range of 0 to 1.

b. Calculate a correction factor: let n = the number of values across the
subband contributing to the two quantities in the above step (in other words, "n" is
the number of bins in the subband). If n is less than 2, let the Angle Consistency
Factor be 1 and go to Steps 411 and 413.

c. Let r = Expected Random Variation = 1/n. Subtract r from the result of
Step 410b.

d. Normalize the result of Step 410c by dividing by (1 - r). The result has a
maximum value of 1. Limit the minimum value to 0 as necessary.
Comments regarding Step 410:

Interchannel Angle Consistency is a measure of how similar the interchannel
phase angles are within a subband over a frame period. If all bin interchannel angles of
the subband are the same, the Interchannel Angle Consistency Factor is 1.0; whereas, if
the interchannel angles are randomly scattered, the value approaches zero.

The Subband Angle Consistency Factor indicates if there is a phantom image
between the channels. If the consistency is low, then it is desirable to decorrelate the
channels. A high value indicates a fused image. Image fusion is independent of other
signal characteristics.

It will be noted that the Subband Angle Consistency Factor, although an angle
parameter, is determined indirectly from two magnitudes. If the interchannel angles are
all the same, adding the complex values and then taking the magnitude yields the same
result as taking all the magnitudes and adding them, so the quotient is 1. If the
interchannel angles are scattered, adding the complex values (such as adding vectors
having different angles) results in at least partial cancellation, so the magnitude of the
sum is less than the sum of the magnitudes, and the quotient is less than 1.

Following is a simple example of a subband having two bins:

Suppose that the two complex bin values are (3 + j4) and (6 + j8). (Same angle in
each case: angle = arctan(imag/real), so angle1 = arctan(4/3) and angle2 = arctan(8/6) =
arctan(4/3).) Adding complex values, sum = (9 + j12), magnitude of which is
square_root(81 + 144) = 15.

The sum of the magnitudes is magnitude of (3 + j4) + magnitude of (6 + j8) = 5 +
10 = 15. The quotient is therefore 15/15 = 1 = consistency (before 1/n normalization; it
would also be 1 after normalization) (normalized consistency = (1 - 0.5) / (1 - 0.5) = 1.0).

Suppose instead that one of the above bins has a different angle, say that the second
one has complex value (6 - j8), which has the same magnitude, 10. The complex sum is
now (9 - j4), which has magnitude of square_root(81 + 16) = 9.85, so the quotient is
9.85 / 15 = 0.66 = consistency (before normalization). To normalize, subtract 1/n = 1/2,
and divide by (1 - 1/n) (normalized consistency = (0.66 - 0.5) / (1 - 0.5) = 0.32).
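The two-bin example can be checked with a short sketch of the consistency calculation. The function name is hypothetical, the complex inputs stand in for the per-bin values whose sum and magnitudes the step uses, and returning 1 for an all-zero subband is an assumption the text does not cover.

```python
def angle_consistency(bins):
    """Normalized angle consistency for the complex values of one subband."""
    n = len(bins)
    if n < 2:
        return 1.0  # fewer than two bins: factor is defined as 1
    mag_sum = sum(abs(b) for b in bins)
    if mag_sum == 0.0:
        return 1.0  # assumption: silent subband treated as consistent
    raw = abs(sum(bins)) / mag_sum  # magnitude of sum over sum of magnitudes
    r = 1.0 / n                     # expected random variation
    return max(0.0, (raw - r) / (1.0 - r))


same_angle = angle_consistency([3 + 4j, 6 + 8j])
different_angle = angle_consistency([3 + 4j, 6 - 8j])
```

Carried at full precision, the second case gives about 0.31; the figure of 0.32 in the text comes from rounding the quotient to 0.66 before normalizing.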

Although the above-described technique for determining a Subband Angle
Consistency Factor has been found useful, its use is not critical. Other suitable techniques
may be employed. For example, one could calculate a standard deviation of angles using
standard formulae. In any case, it is desirable to employ amplitude weighting to
minimize the effect of small signals on the calculated consistency value.

In addition, an alternative derivation of the Subband Angle Consistency Factor
may use energy (the squares of the magnitudes) instead of magnitude. This may be
accomplished by squaring the magnitude from Step 403 before it is applied to Steps 405
and 407.

Step 411. Derive Subband Decorrelation Scale Factor.

Derive a frame-rate Decorrelation Scale Factor for each subband as follows:

a. Let x = frame-rate Spectral-Steadiness Factor of Step 409f.

b. Let y = frame-rate Angle Consistency Factor of Step 410e.

c. Then the frame-rate Subband Decorrelation Scale Factor = (1 - x) * (1 - y),
a number between 0 and 1.

Comments regarding Step 411:

The Subband Decorrelation Scale Factor is a function of the spectral-steadiness of
signal characteristics over time in a subband of a channel (the Spectral-Steadiness Factor)
and the consistency in the same subband of a channel of bin angles with respect to
corresponding bins of a reference channel (the Interchannel Angle Consistency Factor).
The Subband Decorrelation Scale Factor is high only if both the Spectral-Steadiness
Factor and the Interchannel Angle Consistency Factor are low.

As explained above, the Decorrelation Scale Factor controls the degree of
envelope decorrelation provided in the decoder. Signals that exhibit spectral steadiness
over time preferably should not be decorrelated by altering their envelopes, regardless of
what is happening in other channels, as it may result in audible artifacts, namely wavering
or warbling of the signal.
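The product form of Step 411 is small enough to state directly. In this sketch the function and argument names are hypothetical; both inputs are assumed to already lie in the range 0 to 1.

```python
def decorrelation_scale_factor(steadiness, consistency):
    """(1 - x) * (1 - y): high only when both inputs are low."""
    return (1.0 - steadiness) * (1.0 - consistency)
```

A steady signal (steadiness of 1) or a fully consistent set of angles (consistency of 1) each forces the result to 0, so either condition alone is enough to suppress decorrelation.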

Step 412. Derive Subband Amplitude Scale Factors.

From the subband frame energy values of Step 404 and from the subband frame
energy values of all other channels (as may be obtained by a step corresponding to Step
404 or an equivalent thereof), derive frame-rate Subband Amplitude Scale Factors as
follows:

a. For each subband, sum the energy values per frame across all input channels.

b. Divide each subband energy value per frame (from Step 404) by the sum of the
energy values across all input channels (from Step 412a) to create values in the range
of 0 to 1.

c. Convert each ratio to dB, in the range of -∞ to 0.

d. Divide by the scale factor granularity, which may be set at 1.5 dB, for example,
change sign to yield a non-negative value, limit to a maximum value which may be, for
example, 31 (i.e., 5-bit precision) and round to the nearest integer to create the quantized
value. These values are the frame-rate Subband Amplitude Scale Factors and are
conveyed as part of the sidechain information.

e. If the coupling frequency of the encoder is below about 1000 Hz, apply the
subband frame-averaged or frame-accumulated magnitudes to a time smoother that
operates on all subbands below that frequency and above the coupling frequency.

Comments regarding Step 412e: See comments regarding Step 404c except that
in the case of Step 412e, there is no suitable subsequent step in which the time smoothing
may alternatively be performed.

Comments for Step 412:

Although the granularity (resolution) and quantization precision indicated here
have been found to be useful, they are not critical and other values may provide
acceptable results.

Alternatively, one may use amplitude instead of energy to generate the Subband
Amplitude Scale Factors. If using amplitude, one would use dB = 20*log(amplitude ratio),
else if using energy, one converts to dB via dB = 10*log(energy ratio), where amplitude
ratio = square root (energy ratio).
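The energy-ratio quantization of Step 412 can be sketched as below, using the example granularity of 1.5 dB and maximum code of 31. The function name and list-based interface are hypothetical, the time smoother is omitted, and Python's built-in `round` (which rounds exact halves to the nearest even integer) stands in for "round to the nearest integer".

```python
import math


def amplitude_scale_factors(energies, granularity_db=1.5, max_code=31):
    """Quantized scale factors for one subband, one entry per input channel.

    energies: per-channel subband frame energies (non-negative).
    """
    total = sum(energies)  # sum across all input channels
    codes = []
    for energy in energies:
        if total <= 0.0 or energy <= 0.0:
            codes.append(max_code)  # a ratio of 0 is -inf dB: use the limit
            continue
        ratio_db = 10.0 * math.log10(energy / total)  # in the range -inf..0
        # Divide by granularity, change sign, limit, then round.
        codes.append(round(min(max_code, -ratio_db / granularity_db)))
    return codes
```

For two channels of equal energy the ratio is 0.5, or about -3.01 dB, which quantizes to a code of 2 at 1.5 dB per step.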

Step 413. Signal-Dependently Time Smooth Interchannel Subband Phase Angles.

Apply signal-dependent temporal smoothing to subband frame-rate interchannel
angles derived in Step 407f:

a. Let v = Subband Spectral-Steadiness Factor of Step 409d.

b. Let w = corresponding Angle Consistency Factor of Step 410e.

c. Let x = (1 - v) * w. This is a value between 0 and 1, which is high if the
Spectral-Steadiness Factor is low and the Angle Consistency Factor is high.

d. Let y = 1 - x. y is high if Spectral-Steadiness Factor is high and Angle
Consistency Factor is low.

e. Let z = y^exp, where exp is a constant, which may be = 0.1. z is also in the
range of 0 to 1, but skewed toward 1, corresponding to a slow time constant.

f. If the Transient Flag (Step 401) for the channel is set, set z = 0,
corresponding to a fast time constant in the presence of a transient.

g. Compute lim, a maximum allowable value of z, lim = 1 - (0.1 * w). This
ranges from 0.9 if the Angle Consistency Factor is high to 1.0 if the Angle
Consistency Factor is low (0).

h. Limit z by lim as necessary: if (z > lim) then z = lim.

i. Smooth the subband angle of Step 407f using the value of z and a running
smoothed value of angle maintained for each subband. If A = angle of Step 407f
and RSA = running smoothed angle value as of the previous block, and NewRSA
is the new value of the running smoothed angle, then: NewRSA = RSA * z + A *
(1 - z). The value of RSA is subsequently set equal to NewRSA before
processing the following block. NewRSA is the signal-dependently time-
smoothed angle output of Step 413.

Comments regarding Step 413:

When a transient is detected, the subband angle update time constant is set to 0,
allowing a rapid subband angle change. This is desirable because it allows the normal
angle update mechanism to use a range of relatively slow time constants, minimizing
image wandering during static or quasi-static signals, yet fast-changing signals are treated
with fast time constants.

Although other smoothing techniques and parameters may be usable, a first-order
smoother implementing Step 413 has been found to be suitable. If implemented as a first-
order smoother / lowpass filter, the variable "z" corresponds to the feed-forward
coefficient (sometimes denoted "ff0"), while "(1-z)" corresponds to the feedback
coefficient (sometimes denoted "fb1").
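The coefficient computation and the running update of Step 413 can be sketched as two small functions. The names are hypothetical, the inputs v and w are assumed to lie in 0 to 1, and no angle wrap-around handling is attempted, which a real implementation working near +/- pi would likely need.

```python
def smoothing_coefficient(v, w, transient, exp=0.1):
    """Signal-dependent coefficient z for the running angle smoother.

    v: Subband Spectral-Steadiness Factor (0..1).
    w: Angle Consistency Factor (0..1).
    transient: True when the Transient Flag is set for the channel.
    """
    if transient:
        return 0.0              # fast time constant: follow the new angle
    x = (1.0 - v) * w           # high if steadiness is low, consistency high
    y = 1.0 - x
    z = y ** exp                # skewed toward 1 (slow time constant)
    lim = 1.0 - 0.1 * w         # 0.9 at high consistency, 1.0 at zero
    return min(z, lim)


def smooth_angle(rsa, angle, z):
    """NewRSA = RSA * z + A * (1 - z)."""
    return rsa * z + angle * (1.0 - z)
```

The caller keeps one running value per subband and overwrites it with the returned value before processing the following block.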

Step 414. Quantize Smoothed Interchannel Subband Phase Angles.

Quantize the time-smoothed subband interchannel angles derived in Step 413i to
obtain the Subband Angle Control Parameter:

a. If the value is less than 0, add 2π, so that all angle values to be quantized are
in the range 0 to 2π.

b. Divide by the angle granularity (resolution), which may be 2π/64 radians,
and round to an integer. The maximum value may be set at 63, corresponding to
6-bit quantization.

Comments regarding Step 414:

The quantized value is treated as a non-negative integer, so an easy way to
quantize the angle is to map it to a non-negative floating point number (add 2π if less
than 0, making the range 0 to (less than) 2π), scale by the granularity (resolution), and
round to an integer. Similarly, dequantizing that integer (which could otherwise be done
with a simple table lookup) can be accomplished by scaling by the inverse of the angle
granularity factor, converting a non-negative integer to a non-negative floating point
angle (again, range 0 to 2π), after which it can be renormalized to the range ±π for further
use. Although such quantization of the Subband Angle Control Parameter has been found
to be useful, such a quantization is not critical and other quantizations may provide
acceptable results.
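Quantizing and dequantizing an angle as described can be sketched as follows, using the example of 64 levels. The function names are hypothetical and the input angle is assumed to lie between -pi and +pi. A value that rounds up to 64 is clamped to 63, following the stated maximum; wrapping it to 0 would be a different design choice.

```python
import math


def quantize_angle(angle, levels=64):
    """Map an angle in -pi..pi to an integer code in 0..levels-1."""
    if angle < 0.0:
        angle += 2.0 * math.pi           # shift into 0..2*pi
    code = round(angle / (2.0 * math.pi / levels))
    return min(code, levels - 1)         # maximum value, e.g. 63


def dequantize_angle(code, levels=64):
    """Inverse scaling, renormalized to the range -pi..pi."""
    angle = code * (2.0 * math.pi / levels)
    if angle > math.pi:
        angle -= 2.0 * math.pi
    return angle
```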

Step 415. Quantize Subband Decorrelation Scale Factors.

Quantize the Subband Decorrelation Scale Factors produced by Step 411 to, for
example, 8 levels (3 bits) by multiplying by 7.49 and rounding to the nearest integer.
These quantized values are part of the sidechain information.
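The example quantization can be written in one line. The function name is hypothetical and the input is assumed to lie in 0 to 1; with the multiplier of 7.49, an input of 1.0 rounds to 7, so every code fits in 3 bits.

```python
def quantize_decorrelation(scale_factor):
    """Quantize a 0..1 Decorrelation Scale Factor to an integer code 0..7."""
    return round(scale_factor * 7.49)
```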
Comments regarding Step 415:

Although such quantization of the Subband Decorrelation Scale Factors has been
found to be useful, quantization using the example values is not critical and other
quantizations may provide acceptable results.

Step 416. Dequantize Subband Angle Control Parameters.

Dequantize the Subband Angle Control Parameters (see Step 414), to use prior to
downmixing.

Comment regarding Step 416:

Use of quantized values in the encoder helps maintain synchrony between the
encoder and the decoder.

Step 417. Distribute Frame-Rate Dequantized Subband Angle Control
Parameters Across Blocks.

In preparation for downmixing, distribute the once-per-frame dequantized
Subband Angle Control Parameters of Step 416 across time to the subbands of each block
within the frame.

Comment regarding Step 417:

The same frame value may be assigned to each block in the frame. Alternatively,
it may be useful to interpolate the Subband Angle Control Parameter values across the
blocks in a frame. Linear interpolation over time may be employed in the manner of the
linear interpolation across frequency, as described below.

Step 418. Interpolate block Subband Angle Control Parameters to Bins.

Distribute the block Subband Angle Control Parameters of Step 417 for each
channel across frequency to bins, preferably using linear interpolation as described below.

Comment regarding Step 418:

If linear interpolation across frequency is employed, Step 418 minimizes phase
angle changes from bin to bin across a subband boundary, thereby minimizing aliasing
artifacts. Subband angles are calculated independently of one another, each representing
an average across a subband. Thus, there may be a large change from one subband to the
next. If the net angle value for a subband is applied to all bins in the subband (a
"rectangular" subband distribution), the entire phase change from one subband to a
neighboring subband occurs between two bins. If there is a strong signal component
there, there may be severe, possibly audible, aliasing. Linear interpolation spreads the
phase angle change over all the bins in the subband, minimizing the change between any
pair of bins, so that, for example, the angle at the low end of a subband mates with the
angle at the high end of the subband below it, while maintaining the overall average the
same as the given calculated subband angle. In other words, instead of rectangular
subband distributions, the subband angle distribution may be trapezoidally shaped.

For example, suppose that the lowest coupled subband has one bin and a subband
angle of 20 degrees, the next subband has three bins and a subband angle of 40 degrees,
and the third subband has five bins and a subband angle of 100 degrees. With no
interpolation, assume that the first bin (one subband) is shifted by an angle of 20 degrees,
the next three bins (another subband) are shifted by an angle of 40 degrees and the next
five bins (a further subband) are shifted by an angle of 100 degrees. In that example,
there is a 60-degree maximum change, from bin 4 to bin 5. With linear interpolation, the
first bin still is shifted by an angle of 20 degrees, the next 3 bins are shifted by about 30,
40, and 50 degrees; and the next five bins are shifted by about 67, 83, 100, 117, and 133
degrees. The average subband angle shift is the same, but the maximum bin-to-bin
change is reduced to 17 degrees.
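The text gives the interpolated values of the example but not the rule that produces them. The sketch below uses one rule that reproduces those numbers, and it is an assumption: within each subband the angle varies linearly about the subband's centre, with a slope chosen so that the line runs from the last bin of the subband below to this subband's centre. The function name is hypothetical, the lowest subband is left flat, and angle wrap-around is ignored.

```python
def interpolate_subband_angles(subbands):
    """Spread subband angles across bins, lowest subband first.

    subbands: list of (bin_count, subband_angle) pairs.
    Returns one angle per bin; each subband's average equals its angle.
    """
    bin_angles = []
    for count, angle in subbands:
        if not bin_angles:
            slope = 0.0  # nothing below the lowest subband to mate with
        else:
            # Bins from the last bin of the subband below to this centre.
            distance = (count + 1) / 2.0
            slope = (angle - bin_angles[-1]) / distance
        centre = (count - 1) / 2.0
        bin_angles.extend(angle + slope * (k - centre) for k in range(count))
    return bin_angles


angles = interpolate_subband_angles([(1, 20.0), (3, 40.0), (5, 100.0)])
largest_step = max(b - a for a, b in zip(angles, angles[1:]))
```

Because the ramp is symmetric about each subband's centre, the average angle of every subband is unchanged, which is the property the text asks for.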

Optionally, changes in amplitude from subband to subband, in connection with
this and other steps described herein, such as Step 417, may also be treated in a similar
interpolative fashion. However, it may not be necessary to do so because there tends to
be more natural continuity in amplitude from one subband to the next.

Step 419. Apply Phase Angle Rotation to Bin Transform Values for Channel.

Apply phase angle rotation to each bin transform value as follows:

a. Let x = bin angle for this bin as calculated in Step 418.

b. Let y = -x;

c. Compute z, a unity-magnitude complex phase rotation scale factor with
angle y, z = cos(y) + j sin(y).

d. Multiply the bin value (a + jb) by z.
Comments regarding Step 419:

The phase angle rotation applied in the encoder is the inverse of the angle derived
from the Subband Angle Control Parameter.

Phase angle adjustments, as described herein, in an encoder or encoding process
prior to downmixing (Step 420) have several advantages: (1) they minimize cancellations
of the channels that are summed to a mono composite signal, (2) they minimize reliance
on energy normalization (Step 421), and (3) they precompensate the decoder inverse
phase angle rotation, thereby reducing aliasing.

The phase correction factors can be applied in the encoder by subtracting each
subband phase correction value from the angles of each transform bin value in that
subband. This is equivalent to multiplying each complex bin value by a complex number
with a magnitude of 1.0 and an angle equal to the negative of the phase correction factor.
Note that a complex number of magnitude 1, angle A is equal to cos(A) + j sin(A). This
latter quantity is calculated once for each subband of each channel, with A = -phase
correction for this subband, then multiplied by each bin complex signal value to realize
the phase shifted bin value.
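The rotation by a unit-magnitude complex factor can be sketched as follows. The function name is hypothetical, and one angle per bin is assumed, as produced by an interpolation across frequency.

```python
import math


def rotate_bins(bins, bin_angles):
    """Rotate each complex bin value by the negative of its bin angle."""
    rotated = []
    for value, x in zip(bins, bin_angles):
        y = -x                                   # encoder applies the inverse
        z = complex(math.cos(y), math.sin(y))    # unit magnitude, angle y
        rotated.append(value * z)
    return rotated
```

Since the factor has magnitude 1, only the phase of each bin changes; its magnitude, and therefore its energy, is preserved.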

The phase shift is circular, which is benign for continuous signals, but may cause
blurring of transients if different phase angles are used for different subbands, so it may
be desirable to employ the Transient Flag. When the Transient Flag is True, the angle
calculation results may be overridden, and all subbands in a channel may use the same
phase correction factor such as zero or a pseudo-random value.



Step 420. Downmix.

Downmix to mono by adding the corresponding complex transform bins across
channels to produce a mono composite channel.

Comments regarding Step 420:

In the encoder, once the transform bins of all the channels have been phase
shifted, the channels are summed, bin-by-bin, to create the mono composite audio signal.

Step 421. Normalize.

To avoid cancellation of isolated bins and over-emphasis of in-phase signals,
normalize the amplitude of each bin of the mono composite channel to have substantially
the same energy as the sum of the contributing energies, as follows:

a. Let x = the sum across channels of bin energies (i.e., the squares of the bin
magnitudes computed in Step 403).

b. Let y = energy of corresponding bin of the mono composite channel,
calculated as per Step 403.

c. Let z = scale factor = square_root(x/y). If x = 0 then y is 0 and z is set to 1.

d. Limit z to a maximum value of, for example, 100. If z is initially greater
than 100 (implying strong cancellation from downmixing), add an arbitrary value,
for example, 0.01 * square_root(x) to the real and imaginary parts of the mono
composite bin, which will assure that it is large enough to be normalized by the
following step.

e. Multiply the complex mono composite bin value by z.

Comments regarding Step 421:

Although it is generally desirable to use the same phase factors for both encoding
and decoding, even the optimal choice of a subband phase correction value may cause
one or more audible spectral components within the subband to be cancelled during the
encode downmix process because the phase shifting of Step 419 is performed on a
subband rather than a bin basis. In this case, a different phase factor for isolated bins in
the encoder may be used if it is detected that the sum energy of such bins is much less
than the energy sum of the individual channel bins at that frequency. It is generally not
necessary to apply such an isolated correction factor to the decoder, inasmuch as isolated
bins usually have little effect on overall image quality.
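The downmix and the energy normalization can be sketched together. This is an illustration under stated assumptions: the function name and list layout are hypothetical, the limit of 100 and the nudge of 0.01 are the example values, and the final multiplication of the composite bin by the scale factor is assumed to be the step that applies the normalization.

```python
import math


def downmix_and_normalize(channel_bins, max_scale=100.0, nudge=0.01):
    """Sum phase-shifted bins across channels and rescale each bin's energy.

    channel_bins: list of channels, each a list of complex bin values.
    """
    mono = []
    for values in zip(*channel_bins):            # one tuple per bin
        m = sum(values)                          # downmix: complex sum
        x = sum(abs(v) ** 2 for v in values)     # sum of channel energies
        y = abs(m) ** 2                          # energy of composite bin
        if x == 0.0:
            z = 1.0                              # silent bin: nothing to fix
        else:
            z = math.sqrt(x / y) if y > 0.0 else math.inf
            if z > max_scale:                    # strong cancellation
                z = max_scale
                offset = nudge * math.sqrt(x)
                m += complex(offset, offset)     # keep the bin from vanishing
        mono.append(m * z)                       # assumed application of z
    return mono
```

For two in-phase channels the plain sum would carry twice the combined energy, and the scale factor brings it back; for two exactly opposed channels the sum is zero, and the limit plus the small added value yields a finite, non-zero bin.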
Step 422. Assemble and Pack into Bitstream(s).

The Amplitude Scale Factors, Angle Control Parameters, Decorrelation Scale
Factors, and Transient Flags side channel information for each channel, along with the
common mono composite audio, are multiplexed as may be desired and packed into one or
more bitstreams suitable for the storage, transmission or storage and transmission medium
or media.

Comment regarding Step 422:

The mono composite audio may be applied to a data-rate reducing encoding
process or device such as, for example, a perceptual encoder or to a perceptual encoder
and an entropy coder (e.g., arithmetic or Huffman coder) (sometimes referred to as a
"lossless" coder) prior to packing. Also, as mentioned above, the mono composite audio
and related sidechain information may be derived from multiple input channels only for
audio frequencies above a certain frequency (a "coupling" frequency). In that case, the
audio frequencies below the coupling frequency in each of the multiple input channels
may be stored, transmitted or stored and transmitted as discrete channels or may be
combined or processed in some manner other than as described herein. A type of such
arrangements is set forth in the examples of FIGS. 10, 11 and 12, described below.
Discrete or otherwise-combined channels may also be applied to a data reducing encoding
process or device such as, for example, a perceptual encoder or a perceptual encoder and
an entropy encoder. The mono composite audio and the discrete multichannel audio may
all be applied to an integrated perceptual encoding or perceptual and entropy encoding
process or device prior to packing.

Decoding

The steps of a decoding process ("decoding steps") may be described as follows.
With respect to decoding steps, reference is made to FIG. 5, which is in the nature of a
hybrid flowchart and functional block diagram. For simplicity, the figure shows the
derivation of amplitude and scale factors from sidechain information for one channel, it
being understood that amplitude and scale factors must be obtained for each channel.

Step 501. Unpack and Decode Sidechain Information.

Unpack and decode (including dequantization), as necessary, the sidechain data
(Amplitude Scale Factors, Angle Control Parameters, Decorrelation Scale Factors, and
Transient Flag) for each frame of each channel (one channel shown in FIG. 5). Table
lookups may be used to decode the Amplitude Scale Factors, Angle Control Parameter,
and Decorrelation Scale Factors.

Comment regarding Step 501: As explained above, if a reference channel is
employed, the sidechain data for the reference channel may not include the Angle Control
Parameters and Decorrelation Scale Factors.

Step 502. Unpack and Decode Mono Composite Signal.

Unpack and decode, as necessary, the mono composite signal information to
provide DFT coefficients for each transform bin of the mono composite signal.

Comment regarding Step 502:

Step 501 and Step 502 may be considered to be part of a single unpacking and
decoding step.

Step 503. Distribute Angle Parameter Values Across Blocks.

Block Subband Angle Control Parameter values are derived from the dequantized
frame Subband Angle Control Parameter values.

Comment regarding Step 503:

Step 503 may be implemented by distributing the same parameter value to every
block in the frame.

Step 504. Distribute Subband Decorrelation Scale Factor Across Blocks.

Block Subband Decorrelation Scale Factor values are derived from the
dequantized frame Subband Decorrelation Scale Factor values.

Comment regarding Step 504:

Step 504 may be implemented by distributing the same scale factor value to every
block in the frame.

Step 505. Add Pseudo-Random Offset (Technique 3).

In accordance with Technique 3, described above, when the Transient Flag
indicates a transient, add to the block Subband Angle Control Parameter provided by Step
503 a pseudo-random offset value scaled by the Decorrelation Scale Factor (the scaling
may be indirect as set forth in this Step):

a. Let y = block Subband Decorrelation Scale Factor.

b. Let z = y^exp, where exp is a constant, for example = 3. z will also be in the
range of 0 to 1, but skewed toward 0, reflecting a bias toward low levels of
pseudo-random variation unless the Decorrelation Scale Factor value is high.

c. Let x = a pseudo-random number between +1 and -1, chosen separately for
each subband of each block.

d. Then the value added to the block Subband Angle Control Parameter to add
a pseudo-random offset value according to Technique 3 is x * pi * z.
Comments regarding Step 505:

Although the non-linear indirect scaling of Step 505 has been found to be useful,
it is not critical and other suitable scalings may be employed; in particular, other values
for the exponent may be employed to obtain similar results.

When the Subband Decorrelation Scale Factor value is 1, a full range of random
angles from -π to +π are added (in which case the block Subband Angle Control
Parameter values produced by Step 503 are rendered irrelevant). As the Subband
Decorrelation Scale Factor value decreases toward zero, the pseudo-random angle offset
also decreases toward zero, causing the output of Step 505 to move toward the Subband
Angle Control Parameter values produced by Step 503.
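The indirectly scaled offset of Step 505 can be sketched as follows. The function name is hypothetical, the random source is passed in explicitly so the sketch is reproducible, and the default exponent simply follows the example value in the text.

```python
import math
import random


def transient_angle_offset(decorrelation, rng, exp=3.0):
    """Pseudo-random angle offset for one subband of one block.

    decorrelation: block Subband Decorrelation Scale Factor (0..1).
    rng: a random.Random instance supplying the pseudo-random numbers.
    """
    z = decorrelation ** exp       # skewed toward 0 for mid-range inputs
    x = rng.uniform(-1.0, 1.0)     # drawn separately for each call
    return x * math.pi * z
```

Raising the scale factor to a power greater than 1 keeps the offset small unless the Decorrelation Scale Factor is close to 1; at 0.5 with an exponent of 3, the offset is limited to one eighth of the full range.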

If desired, the encoder described above may also add a scaled pseudo-random
offset in accordance with Technique 3 to the angle shift applied to a channel before mono
downmixing. Doing so may improve alias cancellation in the decoder. It may also be
beneficial for improving the synchronicity of the encoder and decoder.
Step 506. Linearly Interpolate Across Frequency.

Derive bin angles from the block subband angles of decoder Step 503 to which
pseudo-random offsets may have been added by Step 505 when the Transient Flag
indicates a transient.

Comments regarding Step 506:

Bin angles may be derived from subband angles by linear interpolation across
frequency as described above in connection with encoder Step 418.

Step 507. Add Pseudo-Random Offset (Technique 2).

In accordance with Technique 2, described above, when the Transient Flag does
not indicate a transient, for each bin, add to all the block Subband Angle Control
Parameters in a frame provided by Step 503 (Step 505 operates only when the Transient
Flag indicates a transient) a different pseudo-random offset value scaled by the
Decorrelation Scale Factor (the scaling may be direct as set forth herein in this step):

a. Let y = block Subband Decorrelation Scale Factor.

b. Let x = a pseudo-random number between +1 and -1, chosen separately for
each bin of each frame.

c. Then the value added to the block bin Angle Control Parameter to add a
pseudo-random offset value according to Technique 2 is x * pi * y.

C«in»ie«ts regarding Step 5©?s 

Although the direct scaling of Step SO? has been found to be useful, it is sot 

15 ailieal and othsf suitable sealings may be employed. 

To minimize temporal discontinuities, the unique pseudo-random angle value for
each bin of each channel preferably does not change with time. The pseudo-random
angle values of all the bins in a subband are scaled by the same Subband Decorrelation
Scale Factor value, which is updated at the frame rate. Thus, when the Subband
Decorrelation Scale Factor value is 1, a full range of random angles from -pi to +pi are
added (in which case block subband angle values derived from the dequantized frame
subband angle values are rendered irrelevant). As the Subband Decorrelation Scale
Factor value diminishes toward zero, the pseudo-random angle offset also diminishes,
and the resulting angle moves toward the Subband Angle Control Parameter value.
Unlike Step 504, the scaling in this Step 507 may be a direct function of the Subband
Decorrelation Scale Factor value. For example, a Subband Decorrelation Scale Factor
value of 0.5 proportionally reduces every random angle variation by 0.5.

The scaled pseudo-random angle value may then be added to the bin angle from
decoder Step 506. The Decorrelation Scale Factor value is updated once per frame. In
the presence of a Transient Flag for the frame, this step is skipped, to avoid transient
prenoise artifacts.

If desired, the encoder described above may also add a scaled pseudo-random
offset in accordance with Technique 2 to the angle shift applied before mono
downmixing. Doing so may improve alias cancellation in the decoder. It may also be
beneficial for improving the synchronicity of the encoder and decoder.
Step 508. Normalize Amplitude Scale Factors.
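
The offset rule of Step 507 (x * pi * y) can be sketched as follows. This is only an illustrative sketch, not the decoder's actual implementation: the function names, the seeded generator, and the use of one Decorrelation Scale Factor for all bins passed in (that is, the bins of a single subband) are assumptions made for the example.

```python
import math
import random

def make_bin_offsets(num_bins, seed=0):
    """Hypothetical helper: one fixed pseudo-random number x in [-1, 1] per bin.

    The offsets are generated once and reused, in line with the preference
    that the pseudo-random value for each bin not change with time.
    """
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(num_bins)]

def add_randomized_offsets(bin_angles, bin_offsets, decorr_scale, transient):
    """Add x * pi * y to each bin angle; skip the step when a transient is flagged."""
    if transient:
        return list(bin_angles)
    return [a + x * math.pi * decorr_scale
            for a, x in zip(bin_angles, bin_offsets)]

# Example: four bins belonging to one subband.
offsets = make_bin_offsets(4, seed=1)
angles = [0.1, 0.2, 0.3, 0.4]
half = add_randomized_offsets(angles, offsets, 0.5, transient=False)
full = add_randomized_offsets(angles, offsets, 1.0, transient=False)
```

With a scale factor of 0.5 every offset is half of what it is with a scale factor of 1, which mirrors the proportional reduction described in the comments on Step 507.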

Normalize Amplitude Scale Factors across channels so that they sum-square to 1.
Comment regarding Step 508:

For example, if two channels have dequantized scale factors of -3.0 dB (= 2 *
granularity of 1.5 dB) (.70795), the sum of the squares is 1.002. Dividing each by the
square root of 1.002 = 1.001 yields two values of .7072 (-3.01 dB).
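
The normalization of Step 508 can be illustrated with a short sketch. The helper names are hypothetical and the zero-sum guard is an added assumption; the arithmetic follows the two-channel example above.

```python
import math

def db_to_linear(db):
    """Convert an amplitude level in dB to a linear scale factor."""
    return 10.0 ** (db / 20.0)

def normalize_scale_factors(factors):
    """Scale the factors so that the sum of their squares equals 1."""
    norm = math.sqrt(sum(f * f for f in factors))
    if norm == 0.0:
        return list(factors)  # assumption: leave an all-zero set unchanged
    return [f / norm for f in factors]

# Two channels, each dequantized to -3.0 dB (about .70795).
factors = [db_to_linear(-3.0), db_to_linear(-3.0)]
normalized = normalize_scale_factors(factors)
```

Without intermediate rounding each normalized value is 1/sqrt(2), about .7071; the .7072 quoted in the text results from dividing by the rounded value 1.001.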
Step 509. Boost Subband Scale Factor Levels (Optional).
Optionally, when the Transient Flag indicates no transient, apply a slight

additional boost to Subband Scale Factor levels, dependent on Subband Decorrelation
Scale Factor levels: multiply each normalized Subband Amplitude Scale Factor by a
small factor (e.g., 1 + 0.2 * Subband Decorrelation Scale Factor). When the Transient
Flag is True, skip this step.
Comment regarding Step 509:

This step may be useful because the decoder decorrelation Step 507 may result in
slightly reduced levels in the final inverse filterbank process.

Step 510. Distribute Subband Amplitude Values Across Bins.
Step 510 may be implemented by distributing the same subband amplitude scale
factor value to every bin in the subband.
Step 511. Upmix.

a. For each bin of each output channel, construct a complex upmix scale
factor from the amplitude of decoder Step 508 and the bin angle of decoder
Step 507: (amplitude * (cos(angle) + j sin(angle))).
b. For each output channel, multiply the complex mono composite bin

value and the complex upmix scale factor to produce the upmixed complex output

bin value of each bin of the channel.
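
Steps 510 and 511 can be sketched together for one output channel. This is a hedged illustration only: the function name and the representation of subbands as (start, stop) bin index ranges are assumptions, and the subbands are assumed to be contiguous and listed in ascending order.

```python
import cmath

def upmix_channel(mono_bins, subband_edges, subband_amps, bin_angles):
    """Upmix complex mono composite bins into one output channel.

    subband_edges: hypothetical list of (start, stop) bin index ranges.
    subband_amps:  one amplitude scale factor per subband (Step 510 spreads
                   the same value over every bin of the subband).
    bin_angles:    one angle in radians per bin.
    """
    out = []
    for (start, stop), amp in zip(subband_edges, subband_amps):
        for b in range(start, stop):
            # amplitude * (cos(angle) + j sin(angle))
            scale = cmath.rect(amp, bin_angles[b])
            out.append(mono_bins[b] * scale)
    return out

mono = [1 + 0j, 0 + 1j, 2 + 0j, 1 + 1j]
edges = [(0, 2), (2, 4)]
amps = [0.5, 1.0]
angles = [0.0, cmath.pi / 2, cmath.pi, 0.0]
channel = upmix_channel(mono, edges, amps, angles)
```

Each output bin keeps the magnitude of the mono bin times the subband amplitude, and its phase is shifted by the bin angle.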

Step 512. Perform Inverse DFT (Optional).

Optionally, perform an inverse DFT transform on the bins of each output channel
to yield multichannel output PCM values.

Comments regarding Step 512:

A decoder according to the present invention may not provide PCM outputs. In
the case where the decoder process is employed only above a given coupling frequency,
and discrete MDCT coefficients are sent for each channel below that frequency, as might
occur in practical implementations of the examples of FIGS. 10, 11 and 12, as described
below, it may be desirable to convert the DFT coefficients derived by the decoder
upmixing Step 511 to MDCT coefficients, so that they can be combined with the lower
frequency discrete MDCT coefficients and requantized in order to provide, for example, a
bitstream compatible with an encoding system that has a large number of installed users,
such as a standard AC-3 S/PDIF bitstream for application to an external device where an
inverse transform may be performed. An inverse DFT transform may be applied to ones
of the output channels to provide PCM outputs.

Section 8.2.2 of the A/52A Document

With Sensitivity Factor "F" Added
8.2.2. Transient detection
Transients are detected in the full-bandwidth channels in order to decide when to
switch to short length audio blocks to improve pre-echo performance. High-pass filtered

versions of the signals are examined for an increase in energy from one sub-block time-
segment to the next. Sub-blocks are examined at different time scales. If a transient is
detected in the second half of an audio block in a channel, that channel switches to a short
block. A channel that is block-switched uses the D45 exponent strategy.

The transient detector is used to determine when to switch from a long transform

block (length 512), to the short block (length 256). It operates on 512 samples for every
audio block. This is done in two passes, with each pass processing 256 samples. Transient
detection is broken down into four steps: 1) high-pass filtering, 2) segmentation of the
block into submultiples, 3) peak amplitude detection within each sub-block segment, and
4) threshold comparison. The transient detector outputs a flag blksw[n] for each full-

bandwidth channel, which when set to "one" indicates the presence of a transient in the
second half of the 512 length input block for the corresponding channel.

1) High-pass filtering: The high-pass filter is implemented as a cascaded
biquad direct form II IIR filter with a cutoff of 8 kHz.

2) Block Segmentation: The block of 256 high-pass filtered samples are
segmented into a hierarchical tree of levels in which level 1 represents the 256
length block, level 2 is two segments of length 128, and level 3 is four segments
of length 64.

3) Peak Detection: The sample with the largest magnitude is identified for
each segment on every level of the hierarchical tree. The peaks for a single level
are found as follows:

P[j][k] = max(x(n))

for n = (512 × (k-1) / 2^j), (512 × (k-1) / 2^j) + 1, ... (512 × k / 2^j) - 1

where: x(n) = the nth sample in the 256 length block
j = 1, 2, 3 is the hierarchical level number
k = the segment number within level j

Note that P[j][0] (i.e., k = 0) is defined to be the peak of the last
segment on level j of the tree calculated immediately prior to the
current tree. For example, P[3][4] in the preceding tree is P[3][0] in the
current tree.

4) Threshold Comparison: The first stage of the threshold comparator
checks to see if there is significant signal level in the current block. This is done
by comparing the overall peak value P[1][1] of the current block to a "silence
threshold". If P[1][1] is below this threshold then a long block is forced. The silence
threshold value is 100/32768. The next stage of the comparator checks the relative
peak levels of adjacent segments on each level of the hierarchical tree. If the peak
ratio of any two adjacent segments on a particular level exceeds a pre-defined
threshold for that level, then a flag is set to indicate the presence of a transient in
the current 256 length block. The ratios are compared as follows:

mag(P[j][k]) × T[j] > (F * mag(P[j][(k-1)])) [Note the "F" sensitivity factor]
where: T[j] is the pre-defined threshold for level j, defined as:
T[1] = .1
T[2] = .075
T[3] = .05

If this inequality is true for any two segment peaks on any level,
then a transient is indicated for the first half of the 512 length input block.
The second pass through this process determines the presence of transients
in the second half of the 512 length input block.
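
The peak detection and threshold comparison steps for one 256-sample pass can be sketched as follows. This is an illustrative sketch under stated assumptions, not the A/52A reference procedure: the high-pass filter is omitted (the input is assumed to be already filtered), the function names and the dictionary layout are hypothetical, and the threshold values .1, .075 and .05 are assumed and should be checked against the A/52A document.

```python
SILENCE_THRESHOLD = 100.0 / 32768.0
THRESHOLDS = {1: 0.1, 2: 0.075, 3: 0.05}  # assumed T[j] values

def segment_peaks(block, level):
    """Peak magnitude of each of the 2**(level-1) equal segments of a block."""
    n_seg = 2 ** (level - 1)
    seg_len = len(block) // n_seg
    return [max(abs(s) for s in block[i * seg_len:(i + 1) * seg_len])
            for i in range(n_seg)]

def detect_transient(block, prev_last_peaks, thresholds=THRESHOLDS, sensitivity=1.0):
    """Return (transient_flag, last_peaks) for one high-pass filtered block.

    prev_last_peaks[j] plays the role of P[j][0]: the peak of the last
    segment on level j of the previously analyzed block.
    sensitivity is the factor F; larger values make detection less likely.
    """
    peaks = {j: segment_peaks(block, j) for j in (1, 2, 3)}
    last_peaks = {j: peaks[j][-1] for j in (1, 2, 3)}
    if peaks[1][0] < SILENCE_THRESHOLD:
        return False, last_peaks  # too quiet: a long block is forced
    for j in (1, 2, 3):
        chain = [prev_last_peaks[j]] + peaks[j]
        for k in range(1, len(chain)):
            if chain[k] * thresholds[j] > sensitivity * chain[k - 1]:
                return True, last_peaks
    return False, last_peaks

quiet_prev = {1: 0.001, 2: 0.001, 3: 0.001}
attack = [0.001] * 128 + [0.5] * 128
flag, last = detect_transient(attack, quiet_prev)
```

Running the function twice, once per 256-sample half and feeding the returned last_peaks of the first pass into the second, corresponds to the two passes described above.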
Downmixing Applications
The downmixing described above, which is an aspect of the present invention, is
useful in many situations in which it is desired to reduce the number of channels of a
multichannel audio signal. In such situations, some or all of the channels of content are
combined or mixed. As described above, channel combining may cause coupling
cancellation artifacts. The above-described downmixing provides for the combining of
channels with reduced or inaudible artifacts.

The mono composite audio signal output of the exemplary embodiment of FIG. 1
(a frequency-domain representation) may be passed through an inverse filterbank if it is
desired to provide a time-domain representation. In either case, the mono composite
output signal is an improved combination of the input channel signals. Whether the input
and output signals are time- or frequency-domain representations is not important.

One application of downmixing according to aspects of the present invention is

the playback of 5.1 channel content in a motor vehicle. Motor vehicles may reproduce
only four channels of 5.1 channel content, corresponding approximately to the Left,
Right, Left Surround and Right Surround channels of such a system. Each channel is
directed to one or more loudspeakers located in positions deemed suitable for
reproduction of directional information associated with the particular channel. However,

motor vehicles usually do not have a center loudspeaker position for reproduction of the
Center channel in such a 5.1 playback system. To accommodate this situation, it is
known to attenuate the Center channel signal (by 3 dB or 6 dB, for example) and to
combine it with each of the Left and Right channel signals to provide a phantom center
channel. However, such simple combining leads to artifacts previously described.

Instead of applying a simple combining, downmixing according to aspects of the
present invention may be applied. For example, the arrangement of FIG. 1 may be
applied twice, once for combining the Left and Center signals, and once for combining
Center and Right signals. In such a case, in which the downmixing is employed in a
reproduction environment, it is, of course, not necessary for the audio analyzers 12 and 14

of FIG. 1 to produce any sidechain information. However, it may still be beneficial to
attenuate the Center channel signal by, for example, 3 dB or 6 dB (6 dB may be more
appropriate than 3 dB in the near-field space of a motor vehicle interior) before
combining it with each of the Left Channel and Right Channel signals so that acoustical
power output from the Center channel signal is approximately the same as it would be if

presented through a dedicated Center channel speaker. Furthermore, it may be beneficial
to denote the Center signal as the reference channel when combining it with each of the
Left Channel and Right Channel signals such that the Rotate Angle (8 or 10), to which the
Center channel signal is applied, does not alter the angles of the Center channel but only
alters the angles of the Left and the Right channel signals. Consequently, the

Center channel signal would not be angle adjusted differently in each of the two
summations (i.e., the Left channel plus Center channel signals summation and the Right
channel plus Center channel signals summation), thus ensuring that the phantom Center
channel image remains stable.

Another application of the downmixing according to aspects of the present
invention is in the playback of multichannel audio in a cinema (motion picture theater).
Standards under development for the next generation of digital cinema systems require
the delivery of up to, and soon to be more than, 16 channels of audio. The majority of

installed cinema systems only provide 5.1 playback or presentation channels (as is well
known, the "0.1" represents the low frequency "effects" channel). Therefore, until the
playback systems are upgraded, at significant expense, there is the need to downmix
content with more than 5.1 channels to 5.1 channels. Such downmixing or combining of
channels leads to artifacts as discussed above.

Therefore, if P channels are to be downmixed to Q channels (where P > Q) the

downmixing according to aspects of the present invention (e.g., as in the exemplary
embodiment of FIG. 1, but with no requirement to provide sidechain information signals)
may be applied to obtain one or more of the Q output channels in which each such output
channel is a combination of two or more of respective ones of the P input channels. If

an input channel is combined into more than one output channel, it may be advantageous
to denote such a channel as a reference channel, such that the Rotate Angle in FIG. 1 does
not alter the angles of such an input channel differently for each output channel into
which it is combined.

Aspects of the present invention are not limited to N:1 encoding as described in

connection with FIG. 1. More generally, aspects of the invention are applicable to the
transformation of any number of input channels (n input channels) to any number of
output channels (m output channels) in the manner of FIG. 6 (i.e., N:M encoding).
Because in many common applications the number of input channels n is greater than the
number of output channels m, the N:M encoding arrangement of FIG. 6 will be referred

to as "downmixing" for convenience in description.

Referring to the details of FIG. 6, instead of summing the outputs of rotate angle 8
and rotate angle 10 in the additive combiner 6 as in the arrangement of FIG. 1, those
outputs may be applied to a downmix matrix device or function 6'. Downmix matrix 6'
may be a passive matrix that provides either a simple summation to one channel, as in the
N:1 encoding of FIG. 1, or to multiple channels. Matrix 6' should have the quality that it
provides only positive addition. The matrix coefficients may be real or complex (real and
imaginary). Other devices and functions in FIG. 6 may be the same as in the FIG. 1
arrangement and they bear the same reference numerals.

Downmix matrix 6' may provide a hybrid frequency-dependent function such that
it provides, for example, m(f1-f2) channels in a frequency range f1 to f2 and m(f2-f3)
channels in a frequency range f2 to f3. For example, below a coupling frequency of, for
example, 1000 Hz the downmix matrix 6' may provide two channels and above the coupling

frequency the downmix matrix 6' may provide one channel. By employing two channels
below the coupling frequency, better spatial fidelity may be obtained, especially if the
two channels represent horizontal directions (to match the horizontality of the human
ears). Such a hybrid mono/stereo arrangement is further described below in connection
with the examples of FIGS. 10, 11 and 12.
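
The frequency-dependent behaviour of such a matrix can be sketched for the two-channel case. This is a minimal sketch, assuming frequency-domain bins ordered from low to high frequency; the function names are hypothetical, and plain summation stands in for the angle-adjusted combining described elsewhere in this document.

```python
def coupling_bin_index(coupling_hz, sample_rate, fft_size):
    """Index of the first bin at or above the coupling frequency (truncated)."""
    return int(coupling_hz * fft_size / sample_rate)

def hybrid_downmix(left_bins, right_bins, coupling_bin):
    """Keep two channels below the coupling bin and one summed channel above it."""
    low_left = list(left_bins[:coupling_bin])
    low_right = list(right_bins[:coupling_bin])
    high_mono = [l + r for l, r in
                 zip(left_bins[coupling_bin:], right_bins[coupling_bin:])]
    return low_left, low_right, high_mono

# Example: 1000 Hz coupling frequency, 48 kHz sampling, 512-point transform.
k = coupling_bin_index(1000, 48000, 512)
low_l, low_r, high = hybrid_downmix([1, 2, 3, 4], [5, 6, 7, 8], 2)
```

The sample rate and transform size in the example are arbitrary illustrative values, not parameters taken from the text.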

Although FIG. 6 shows the generation of the same sidechain information for each

channel as in the FIG. 1 arrangement, it may be possible to omit certain ones of the
sidechain information when more than one channel is provided by the output of the
downmix matrix 6'. In some cases, acceptable results may be obtained when only the
amplitude scale factor sidechain information is provided by the FIG. 6 arrangement.

Further details regarding sidechain options are discussed below in connection with the
descriptions of FIGS. 7, 8 and 9.

As just mentioned above, the multiple channels generated by the downmix matrix
6' need not be fewer than the number of input channels n. When the purpose of an
encoder such as in FIG. 6 is to reduce the number of bits for transmission or storage, it is

likely that the number of channels produced by downmix matrix 6' will be fewer than the
number of input channels n. However, the arrangement of FIG. 6 may also be used as a
"downmixer" as described above in connection with FIG. 1. In that case, there may be
applications in which the number of channels m produced by the downmix matrix 6' is
more than the number of input channels n.

A more generalized form of the arrangement of FIG. 2 is shown in FIG. 7,

wherein an upmix matrix 20 receives the 1 to m channels generated by the arrangement of
FIG. 6. The upmix matrix 20 may be a passive matrix that is the conjugate transposition
of the downmix matrix 6' of the FIG. 6 arrangement. In principle, the upmix matrix 20
may be a variable matrix, or a passive matrix in combination with a variable matrix in
which the variable matrix coefficients are controlled directly or indirectly by the
sidechain information. Other elements of FIG. 7 are as in the arrangement of FIG. 2 and
bear the same reference numerals.
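
The relationship between a passive downmix matrix and an upmix matrix formed as its conjugate transposition can be sketched as follows. The matrices are represented as nested lists of complex numbers, and the function names and coefficient values are illustrative assumptions only.

```python
def conjugate_transpose(matrix):
    """Swap rows and columns and conjugate every coefficient."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[r][c].conjugate() for r in range(rows)] for c in range(cols)]

def apply_matrix(matrix, vector):
    """Multiply a matrix by a column vector of complex bin values."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

downmix = [[0.5 + 0j, 0.5 + 0j]]        # 2 input channels -> 1 channel
upmix = conjugate_transpose(downmix)    # 1 channel -> 2 output channels

composite = apply_matrix(downmix, [1 + 1j, 1 - 1j])
outputs = apply_matrix(upmix, composite)
```

In a complete decoder the outputs of such a passive upmix would still be shaped by the sidechain parameters; this sketch shows only the matrix relationship.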

Alternative Decorrelation

FIGS. 8 and 9 show variations on the generalized decoder of FIG. 7. In particular,
both the arrangement of FIG. 8 and the arrangement of FIG. 9 show alternatives to the
decorrelation technique of FIGS. 2 and 7. In FIG. 8, respective decorrelators 46 and 48
are in the PCM domain, each following the respective inverse filterbank 30 and 36 in

their channel. In FIG. 9, respective decorrelators 50 and 52 are in the frequency domain,
each preceding the respective inverse filterbank 30 and 36 in their channel. In both the
FIG. 8 and FIG. 9 arrangements, each of the decorrelators has a unique characteristic so
that their outputs are mutually decorrelated with respect to each other. The Decorrelation
Scale Factor may be used to control, for example, the ratio of decorrelated to uncorrelated

signal provided in each channel. Optionally, the Transient Flag may also be used to shift
the mode of operation of the decorrelator, as is explained below. In both the FIG. 8 and
FIG. 9 arrangements, each decorrelator may be a Schroeder-type reverberator having its
own unique filter characteristic, in which the degree of reverberation is controlled by the
decorrelation scale factor (implemented, for example, by controlling the degree to which

the decorrelator output forms a part of a linear combination of the decorrelator input and
output). Alternatively, other controllable decorrelation techniques may be employed
either alone or in combination with each other or with a Schroeder-type reverberator.
Schroeder-type reverberators are well known and may trace their origin to two journal
papers: "'Colorless' Artificial Reverberation" by M.R. Schroeder and B.F. Logan, IRE

Transactions on Audio, vol. AU-9, pp. 209-214, 1961 and "Natural Sounding Artificial

Reverberation" by M.R. Schroeder, Journal A.E.S., July 1962, vol. 10, no. 2, pp. 219-223.
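
A PCM-domain decorrelator of the kind described above can be sketched with a single Schroeder all-pass section whose output is blended with its input. This is only a sketch under assumptions: the function names, the single-section structure, and the particular blend (1 - s) * input + s * output are illustrative choices, and a practical reverberator would typically cascade several sections with different delays per channel.

```python
def schroeder_allpass(x, delay, gain):
    """One Schroeder all-pass section: y[n] = -g*x[n] + x[n-D] + g*y[n-D]."""
    y = []
    for n, xn in enumerate(x):
        x_d = x[n - delay] if n >= delay else 0.0
        y_d = y[n - delay] if n >= delay else 0.0
        y.append(-gain * xn + x_d + gain * y_d)
    return y

def decorrelate(x, delay, gain, decorr_scale):
    """Linear combination of the decorrelator input and output.

    decorr_scale = 0 passes the input unchanged; decorr_scale = 1 passes
    only the all-pass output (assumed mapping of the scale factor).
    """
    wet = schroeder_allpass(x, delay, gain)
    return [(1.0 - decorr_scale) * d + decorr_scale * w for d, w in zip(x, wet)]

impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
response = schroeder_allpass(impulse, delay=2, gain=0.5)
```

Giving each channel a different delay (and, if desired, gain) is one way to make the channel outputs mutually decorrelated.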

When the decorrelators 46 and 48 operate in the PCM domain, as in the FIG. 8
arrangement, a single Decorrelation Scale Factor is required. This may be obtained by
any of several ways. For example, only a single Decorrelation Scale Factor may be

generated in the encoder of FIG. 1 or FIG. 7. Alternatively, if the encoder of FIG. 1 or
FIG. 7 generates Decorrelation Scale Factors on a subband basis, the Subband
Decorrelation Scale Factors may be amplitude or power summed in the encoder of FIG. 1
or FIG. 7 or in the decoder of FIG. 8.
When the decorrelators 50 and 52 operate in the frequency domain, as in the FIG.
9 arrangement, they may receive a decorrelation scale factor for each subband or groups
of subbands and, concomitantly, provide a commensurate degree of decorrelation for such
subbands or groups of subbands.
The decorrelators 46 and 48 of FIG. 8 and the decorrelators 50 and 52 of FIG. 9

may optionally receive the transient flag. In the PCM domain decorrelators of FIG. 8, the
transient flag may be employed to shift the mode of operation of the respective
decorrelator. For example, the decorrelator may operate as a Schroeder-type reverberator
in the absence of the transient flag but upon its receipt and for a short subsequent time

period, say 1 to 10 milliseconds, operate as a fixed delay. Each channel may have a
predetermined fixed delay or the delay may be varied in response to a plurality of
transients within a short time period. In the frequency domain decorrelators of FIG. 9, the
transient flag may also be employed to shift the mode of operation of the respective
decorrelator. However, in this case, the receipt of a transient flag may, for example,

trigger a short (several milliseconds) increase in amplitude in the channel in which the



As mentioned above, when two or more channels are sent in addition to sidechain
information, it may be acceptable to reduce the number of sidechain parameters. For
example, it may be acceptable to send only the amplitude scale factor, in which case the
decorrelation and angle devices or functions in the decoder may be omitted (in that case,
FIGS. 8 and 9 reduce to the same arrangement).

Alternatively, only the amplitude scale factor, the decorrelation scale factor, and,
optionally, the transient flag may be sent. In that case, either the FIG. 8 or 9
arrangements would be employed (omitting the rotate angle 28 and 34 in each of them)
because the FIG. 7 arrangement also requires the angle control parameter.

As another alternative, only the amplitude scale factor and the angle control
parameter may be sent. In that case, either the FIG. 8 or 9 arrangements would be
employed (omitting the decorrelator 46, 48, 50, 52 in each of them) because the FIG. 7
arrangement also requires the decorrelation scale factor.

As in FIGS. 1 and 2, the arrangements of FIGS. 6-9 are intended to show any
number of input and output channels although, for simplicity in presentation, only two
channels are shown.

As mentioned above in connection with the description of the examples of FIGS.
1, 2, and 6 through 9, aspects of the invention are also useful for improving the
performance of a low bit rate encoding/decoding system in which a discrete two-channel
stereophonic ("stereo") input audio signal, which may have been downmixed from more
than two channels, is encoded, such as by perceptual encoding, transmitted or stored,
decoded, and reproduced in two channels as a discrete stereo audio signal below a
coupling frequency fm and, generally, as a monophonic ("mono") audio signal above the
frequency fm (in other words, there is substantially no stereo channel separation in the two
channels at frequencies above fm; they both carry essentially the same audio
information). The result is what may be called a "hybrid mono/stereo" signal. By
combining the stereo input channels at frequencies above the coupling frequency fm,
fewer bits need be transmitted or stored. By employing a suitable coupling frequency, the
reproduced hybrid mono/stereo signal may provide acceptable performance depending on
the audio material and the perceptiveness of the listener. As mentioned above in
connection with the description of the example of FIGS. 1 and 2, a coupling or transition
frequency as low as 2300 Hz or even 1000 Hz may be suitable, but the coupling
frequency is not critical. Another possible choice for a coupling frequency is 4 kHz.
Other frequencies may provide a useful balance between bit savings and listener
acceptance and the choice of a particular coupling frequency is not critical to the
invention. The coupling frequency may be variable and, if variable, it may depend, for
example, directly or indirectly on input signal characteristics.

Although such a system may provide acceptable results for most musical material
and most listeners, it may be desirable to improve the performance of such a system
provided that such improvements are backward compatible and do not render obsolete or
unusable an installed base of "legacy" decoders designed to receive such hybrid
mono/stereo signals. Such improvements may include, for example, additional
reproduced channels, such as "surround sound" channels. Although surround sound
channels can be derived from a two-channel stereo signal by means of an active matrix
decoder, many such decoders employ wideband control circuits that operate properly only
when the signals applied to them are stereo throughout the signals' bandwidth; such
decoders do not operate properly under some signal conditions when a hybrid
mono/stereo signal is applied to them.
For example, in a 2:5 (two channels in, five channels out) matrix decoder that
provides channels representing front left, front center, front right, left (rear/side) surround
and right (rear/side) surround direction outputs and steers its output to the front center
when essentially the same signal is applied to its inputs, a dominant signal above the
frequency fm (hence, a mono signal in a hybrid mono/stereo system) may cause all of the
signal components, including those below the frequency fm that may be simultaneously
present, to be reproduced by the center front output. Such matrix decoder characteristics
may result in sudden signal location shifts when the dominant signal shifts from above fm
to below fm or vice-versa.

Examples of active matrix decoders that employ wideband control circuits include
Dolby Pro Logic and Dolby Pro Logic II decoders. "Dolby" and "Pro Logic" are
trademarks of Dolby Laboratories Licensing Corporation. Aspects of Pro Logic decoders
are disclosed in U.S. Patents 4,799,260 and 4,941,177, each of which is incorporated by
reference herein in its entirety. Aspects of Pro Logic II decoders are disclosed in pending

U.S. Patent Application S.N. 09/532,711 of Fosgate, entitled "Method for Deriving at
Least Three Audio Signals from Two Input Audio Signals," filed March 22, 2000 and
published as WO 01/41504 on June 7, 2001, and in pending U.S. Patent Application
S.N. 10/362,786 of Fosgate et al, entitled "Method for Apparatus for Audio Matrix Decoding,"
filed February 25, 2003 and published as US 2004/0125960 A1 on July 1, 2004. Each of

said applications is incorporated by reference herein in its entirety. Some aspects of the
operation of Dolby Pro Logic and Pro Logic II decoders are explained, for example, in
papers available on the Dolby Laboratories' website (www.dolby.com): "Dolby
Surround Pro Logic Decoder Principles of Operation," by Roger Dressler, and "Mixing
with Dolby Pro Logic II Technology," by Jim Hilson. Other active matrix decoders are

known that employ wideband control circuits and derive more than two output channels
from a two-channel stereo input.

Aspects of the present invention are not limited to the use of Dolby Pro Logic or
Dolby Pro Logic II matrix decoders. Alternatively, the active matrix decoder may be a
multiband active matrix decoder such as described in International Application

PCT/US02/03619 of Davis, entitled "Audio Channel Translation," designating the United
States, published August 15, 2002 as WO 02/063925 A2 and in International Application
PCT/US2003/024570 of Davis, entitled "Audio Channel Spatial Translation," designating
the United States, published March 4, 2004 as WO 2004/019656 A2. Each of said
international applications is hereby incorporated by reference in its entirety. Although,
because of its multibanded control, such an active matrix decoder when used with a legacy
mono/stereo decoder does not suffer from the problem of sudden signal location shifts
when the dominant signal shifts from above fm to below fm or vice-versa (the multiband
active matrix decoder operates normally for signal components below the frequency fm
whether or not there are dominant signal components above the frequency fm), such
multibanded active matrix decoders do not provide channel multiplication above the
frequency fm when the input is a mono/stereo signal such as described above.

It would be useful to augment a low bitrate hybrid stereo/mono type encoding/

decoding system (such as the system just described or a similar system) so that the mono
audio information above the frequency fm is augmented so as to approximate the original
stereo audio information, at least to the extent that the resulting augmented two-channel
audio, when applied to an active matrix decoder, particularly one that employs a
wideband control circuit, causes the matrix decoder to operate substantially or more

nearly as though the original wideband stereo audio information were applied to it.

As will be described, aspects of the present invention may also be employed to
improve the downmixing to mono in a hybrid mono/stereo encoder. Such improved
downmixing may be useful in improving the reproduced output of a hybrid mono/stereo
system whether or not the above-mentioned augmentation is employed and whether or not

an active matrix decoder is employed at the output of a hybrid mono/stereo decoder.
FIG. 10a shows an idealized block diagram showing the principal functions or
devices of an augmented mono/stereo encoder or encoding function according to aspects
of the invention. A two-channel stereo input is applied to a mono/stereo encoder or
encoding function 1002 ("Mono/Stereo Encoder"), the output of which is suitable for

decoding by a legacy mono/stereo decoder or decoding function. The Encoder 1002 may
employ, for example, perceptual encoding and provides a mono/stereo output, for
example, as described above. Such two-channel input and output are each shown with
two lines to symbolically represent the two channels, it being understood that multiple
channel inputs or outputs represented with multiple lines in drawings herein may be

assembled and packed into a single bitstream.

Still referring to FIG. 10a, the two-channel stereo input is also applied to a device
or function ("Derive Spatial Parameters") 1004 that derives spatial parameters
characterizing the stereo input signals generally above the coupling frequency fm. Such
spatial parameters may include, for example, interchannel amplitude, and either or both of
interchannel phase (or time) difference and interchannel coherence (as measured, for
example, by peak cross-correlation). The amount of data required to carry such
parameters may be much less than that which would have been required to convey
frequencies above a coupling frequency as two discrete channels rather than as a
combined monophonic one. Preferably, such parameters are minimally sufficient to
augment the hybrid mono/stereo output of a legacy decoder such that its two-channel
characteristics above the coupling frequency fm are sufficient to cause a typical wideband-
control-circuit matrix decoder to operate substantially as though the original wideband

stereo audio information were applied to it. Device or function 1004 generates a low-

bitrate spatial-parameter sidechain signal suitable for combining with the bitstream output
of the encoder 1002 in a device or function ("Combiner") 1006. Preferably, the sidechain
information is combined so that it is carried in or with the normal hybrid mono/stereo
encoder bitstream in such a way that the operation of a legacy mono/stereo decoder

receiving such a bitstream is not affected.

The particular manner in which such sidechain information is carried in the
encoder bitstream is not critical to the invention. Many known techniques may be
suitable. For example, many encoders generate a bitstream having unused or null bits that
are ignored by the decoder. An example of such an arrangement is set forth in United

States Patent 6,807,528 B1 of Truman et al, entitled "Adding Data to a Compressed Data
Frame," October 19, 2004, which patent is hereby incorporated by reference in its
entirety. Such bits may be replaced with the sidechain information. Another example is
that the sidechain information may be steganographically encoded in the encoder's
bitstream. Alternatively, the sidechain information may be stored or transmitted

separately from the bitstream produced by encoder 1002 by any technique that permits the
transmission or storage of such information along with a mono/stereo bitstream
compatible with legacy decoders.

FIG. 11a shows an idealized block diagram showing the principal functions or
devices of an alternative augmented mono/stereo encoder or encoding function according
to aspects of the invention. In the FIG. 11a alternative, in addition to deriving
spatial-parameter sidechain information, the two-channel stereo input is processed so that
it is in better condition for summing to mono above the coupling frequency. Such processing
may include, for example, adjustment of the relative phase angle above the coupling
frequency f_m between the two input channels so as to reduce cancellation when the
channels are summed to mono and, preferably, to avoid cancellation of isolated frequency
bins and over-emphasis of in-phase signals, by normalising the amplitude of each bin of
the mono composite channel to have substantially the same energy as the sum of the
contributing energies. Thus, FIG. 11a shows a two-channel stereo input suitable for
application directly to a mono/stereo encoder or encoding function 1102 ("Mono/Stereo
Encoder"), the output of which, in turn, is suitable for decoding by a "legacy"
mono/stereo decoder or decoding function. Encoder 1102 may be the same device or
function as encoder 1002 of the FIG. 10a arrangement. Instead of being applied directly
to encoder 1102, the two-channel stereo input is applied to a device or function
("Pre-process and Derive Spatial Parameters") 1100 that pre-processes the two-channel
stereo input in order to improve the subsequent downmixing to mono above the frequency f_m
in the hybrid mono/stereo encoder 1102 and that generates a low-bitrate spatial-parameter
sidechain information signal suitable for combining with the bitstream output of the
encoder 1102 in a device or function ("Combine") 1106. Combine 1106 may be the same
device or function as Combine 1006 of the FIG. 10a arrangement. Other aspects of the
example of FIG. 11a are the same as in the example of FIG. 10a.
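
By way of illustration only, the two pre-processing steps just described (relative phase adjustment and per-bin energy normalisation of the mono composite) might be sketched for a single frequency bin as follows. The function name and the choice to rotate the second channel fully onto the phase of the first are assumptions of this sketch; the text above does not prescribe how much phase adjustment is applied.

```python
import cmath
import math


def phase_aligned_mono_bin(left, right):
    """Combine one pair of complex bins into a single mono composite bin.

    Step 1: rotate the right bin onto the left bin's phase angle so the
            two bins cannot cancel when summed.
    Step 2: scale the sum so its energy equals the sum of the energies
            of the two contributing bins."""
    if left != 0 and right != 0:
        angle = cmath.phase(left * right.conjugate())  # left minus right phase
    else:
        angle = 0.0
    aligned_right = right * cmath.exp(1j * angle)
    mono = left + aligned_right
    if abs(mono) == 0.0:
        return 0j  # both inputs are silent
    target_energy = abs(left) ** 2 + abs(right) ** 2
    return mono * math.sqrt(target_energy) / abs(mono)
```

With out-of-phase inputs such as 1 and -1, a plain sum would be zero, whereas the sketch yields a bin of energy 2. With in-phase inputs 1 and 1, a plain sum would have energy 4, whereas the sketch again yields energy 2.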

Except for the transmission or storage of the spatial-parameter sidechain
information in a manner compatible with legacy decoders, the function of FIGS. 10a and
11a may be implemented, for example, by the encoder of FIG. 6, described above, in
which the downmix in block 6' is such that there are two channels in the frequency
range f1 to f2 and one channel in the frequency range f2 to f3, where f1 is the lower
frequency limit of the encode/decode arrangement, f2 is the coupling frequency f_m, and f3
is the upper frequency limit of the encode/decode arrangement.
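
By way of illustration only, a downmix that leaves two channels in the range f1 to f2 and one channel in the range f2 to f3 might be sketched as follows. The function name, the bin-index representation of the coupling frequency, and the use of a plain average for the composite channel are assumptions of this sketch.

```python
def hybrid_downmix(left_bins, right_bins, coupling_bin):
    """Keep two discrete channels below coupling_bin (range f1 to f2) and
    form one composite channel from coupling_bin upward (range f2 to f3)."""
    if len(left_bins) != len(right_bins):
        raise ValueError("left and right blocks must have the same length")
    low_left = list(left_bins[:coupling_bin])
    low_right = list(right_bins[:coupling_bin])
    # Plain average used as the composite; an assumption for illustration.
    high_mono = [
        0.5 * (l + r)
        for l, r in zip(left_bins[coupling_bin:], right_bins[coupling_bin:])
    ]
    return low_left, low_right, high_mono
```

A legacy decoder receiving such a signal would reproduce the composite channel in both of its outputs above the coupling frequency.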

It will be appreciated that device or function 1100 performs two processes and
that it may also be shown as two blocks rather than one. It will also be appreciated that
various devices, functions and processes shown and described in various examples herein
may be shown combined or separated in ways other than as shown in the figures herein.
For example, when implemented by computer software instruction sequences, all of the
functions of FIGS. 10a and 11a may be implemented by multithreaded software
instruction sequences running in suitable digital signal processing hardware, in which
case the various devices and functions in the examples shown in the figures may
correspond to portions of the software instructions.




The two-channel stereo inputs in the examples of FIGS. 10a and 11a may be
derived from more than two channels. For example, five channels representing front left,
front center, front right, left (rear/side) surround and right (rear/side) surround directions
may be downmixed to two stereo channels by a matrix encoder (typically a fixed,
non-active encoder) whose encoding characteristics are chosen to be complementary to the
decoding characteristics of an expected matrix decoder. Such arrangements are shown in
FIGS. 10b and 11b, each of which employs a downmixing matrix encoder or encoding
function ("Matrix Encoder") 1001 and 1101, respectively.
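
By way of illustration only, a fixed (non-active) matrix encoder that downmixes five channels to two might be sketched as follows. The coefficient values, the channel ordering and the names used are hypothetical and are not taken from this description; an actual encoder would use coefficients complementary to the expected matrix decoder.

```python
import math

G = math.sqrt(0.5)  # hypothetical -3 dB mixing gain

# Rows produce the two output channels; columns are ordered
# (front left, front center, front right, left surround, right surround).
# The +/-90 degree surround phase shifts are illustrative assumptions.
ILLUSTRATIVE_MATRIX = (
    (1.0, G, 0.0, 1j * G, 0.0),
    (0.0, G, 1.0, 0.0, -1j * G),
)


def matrix_encode(channels, matrix=ILLUSTRATIVE_MATRIX):
    """Apply a fixed encoding matrix to one frequency-domain value per
    input channel and return the two downmixed values."""
    if any(len(row) != len(channels) for row in matrix):
        raise ValueError("matrix width must match the number of channels")
    return tuple(sum(c * x for c, x in zip(row, channels)) for row in matrix)
```

In this sketch a center-only input appears equally and in phase in both outputs, while a surround-only input appears in one output with a quadrature phase shift.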

As mentioned above, the bitstream produced by the encoding examples of FIGS.
10a, 10b, 11a and 11b is compatible with a legacy mono/stereo decoder. Thus, one
example of a suitable decoding arrangement for such a bitstream is simply a legacy
mono/stereo decoder receiving and processing the bitstream (not shown in view of its
simplicity). In the case of an encoder such as in the examples of FIGS. 10a and 10b, such
a legacy decoder will operate as though the encoder produced a bitstream intended for a
legacy decoder. In the case of an encoder such as in the examples of FIGS. 11a and 11b,
such a legacy decoder may operate with improved performance in view of the
pre-processing in function or device 1100. Nevertheless, whether the bitstream received by a
legacy mono/stereo decoder is produced by an arrangement as in the examples of FIGS.
10a, 10b, 11a or FIG. 11b, the output of such a legacy decoder remains unsuitable for
application to an active matrix decoder, particularly one that employs a wideband control
circuit - the output remains mono above the frequency f_m because a legacy decoder does
not recognize or use the spatial parameter sidechain information.

FIG. 12a shows an idealised block diagram showing the principal functions or
devices of an augmented mono/stereo decoder or decoding function according to aspects
of the invention. A bitstream such as may be generated by an augmented encoder such as
in the examples of FIGS. 10a/10b or FIGS. 11a/11b is applied to a device or function
("Recover Spatial Parameters") 1202 that recovers the spatial parameter sidechain
information and provides that information as an output. Recover Spatial Parameters 1202
may either remove that information from the bitstream it receives to provide a further
output that is applied to a legacy mono/stereo decoder or decoding function ("Legacy
Decoder") 1204 or it may apply the bitstream it receives unaltered to the decoder 1204
because the legacy decoder will ignore the sidechain information.

The mono/stereo output from Legacy Decoder 1204 is applied to a function or
device ("Apply Spatial Parameters") 1206 that applies the spatial parameter sidechain
information recovered by device or function 1202 to the two-channel mono/stereo output
of the Legacy Decoder 1204 so that the mono audio information above the coupling
frequency f_m is augmented so as to approximate the original stereo audio information, at
least to the extent that the resulting augmented two-channel audio, when applied to an
active matrix decoder, causes the matrix decoder to operate substantially or more nearly
as though the original wideband stereo audio information were applied to it. As shown in
FIG. 12b, the augmented two-channel audio information from Apply Spatial Parameters
1206 may then be applied to an active matrix decoder or decoding function ("Active
Matrix Decoder") 1208, including those that employ a wideband control circuit, so as to
increase the number of channels. Matrix Decoder 1208 may be any of the decoders
described above. Typically, such a matrix decoder operates at least in part in response to
the relative phase between the channels applied to it. Thus, the approximation of the
original stereophonic audio information includes an approximation of the relative phase
between the channels.
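
By way of illustration only, applying recovered spatial parameters to the mono audio above the coupling frequency might be sketched as follows. The function name, the parameter conventions (amplitude ratio as left over right, phase difference as left minus right) and the energy-preserving gain rule are assumptions of this sketch. Coherence, and any decorrelation it might control, is omitted.

```python
import cmath
import math


def apply_spatial_parameters(mono_bins, amplitude_ratio, phase_difference):
    """Turn mono bins above the coupling frequency into a left/right pair
    whose level ratio and relative phase follow the given parameters.

    The gains are chosen so that the total energy of the pair equals the
    energy of the mono bin reproduced in both decoder outputs."""
    gain_right = math.sqrt(2.0 / (1.0 + amplitude_ratio ** 2))
    gain_left = amplitude_ratio * gain_right
    # Split the phase difference symmetrically between the two channels.
    rotation = cmath.exp(0.5j * phase_difference)
    left = [gain_left * m * rotation for m in mono_bins]
    right = [gain_right * m * rotation.conjugate() for m in mono_bins]
    return left, right
```

Because the relative phase between the two outputs is restored, a downstream matrix decoder that steers on relative phase receives an approximation of the cue it would have derived from the original stereo signal.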

A decoder according to aspects of the invention illustrated in the examples of
FIGS. 12a and 12b may be characterised as a "hybrid matrix decoder" for operating in a
"hybrid matrix encoder/decoder system." "Hybrid" in this context refers to the fact that
the decoder derives some measure of control information from its input audio signal and a
further measure of control information from spatial-parameter sidechain information.

As mentioned above, various devices, functions and processes shown and
described in various examples herein may be shown combined or separated in ways other
than as shown in the figures. Thus, in the case of the FIGS. 12a and 12b examples, the
Legacy Decoder 1204 may be implemented, for example, by a legacy device or function
in combination with other devices and functions or its operation may be emulated as part
of a device or function that also provides the recovery of and application of spatial
parameter functions. Similarly, the active matrix decoder may be implemented as a
separate legacy matrix decoder device or function or it may be incorporated with other
devices or functions of the FIG. 12a and 12b examples.

As mentioned above, the functions of the FIGS. 10a/10b and 11a/11b encoders
may be implemented, for example, by the encoder of FIG. 6, described above, when such
an encoder provides for the transmission or storage of the spatial-parameter sidechain
information in a manner compatible with legacy decoders. Similarly, the functions of the
FIGS. 12a/12b decoders may be implemented, for example, by the decoders of any one of
the FIGS. 7, 8 and 9 examples when they provide for the recovery of the
spatial-parameter sidechain information that was transmitted or stored in a manner
compatible with legacy decoders.

An alternative to the arrangements of FIGS. 10a/10b, 11a/11b and 12a/12b that also may
allow a legacy matrix decoder to operate substantially or more nearly as though the
original wideband stereo audio information were applied to it is to send or store no
spatial-parameter sidechain information (thus, augmented encoders such as the examples
of FIGS. 10a/10b and 11a/11b are not necessary) and to approximate a two-channel stereo
signal above the frequency f_m using the mono signal above that coupling frequency and
spatial-parameter information derived from the two-channel stereo signal below the
coupling frequency f_m. Such a decoding arrangement may be represented as in the
example of FIG. 12c in which a device or function ("Derive Simulated Spatial
Parameters") 1210 generates simulated spatial-parameter sidechain information from the
hybrid stereophonic/monophonic output of Legacy Decoder 1204 - it recovers simulated
spatial parameter information for the single-channel monophonic audio signal above the
frequency f_m from the discrete two-channel audio signal below the frequency f_m, and
applies the simulated spatial parameter information to the decoded audio signal (Apply
Spatial Parameters 1206) so that the augmented monophonic audio information above the
coupling frequency f_m approximates the original stereophonic audio information.
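
By way of illustration only, the derivation of simulated spatial parameters from the stereo signal below the coupling frequency, and their application to the mono signal above it, might be sketched as follows. The function name, the use of one broadband level ratio and one broadband phase difference, and the energy-preserving gain rule are assumptions of this sketch.

```python
import cmath
import math


def simulate_and_apply(low_left, low_right, high_mono):
    """Estimate level ratio and relative phase from the discrete stereo
    bins below the coupling frequency, then impose them on the mono bins
    above it. Returns (high_left, high_right)."""
    eps = 1e-12  # guards against division by zero on silent blocks
    energy_left = sum(abs(x) ** 2 for x in low_left)
    energy_right = sum(abs(x) ** 2 for x in low_right)
    cross = sum(a * b.conjugate() for a, b in zip(low_left, low_right))
    ratio = math.sqrt((energy_left + eps) / (energy_right + eps))
    phase = cmath.phase(cross)  # left phase minus right phase
    # Assume the low-band parameters also hold above the coupling frequency.
    gain_right = math.sqrt(2.0 / (1.0 + ratio ** 2))
    gain_left = ratio * gain_right
    rotation = cmath.exp(0.5j * phase)
    high_left = [gain_left * m * rotation for m in high_mono]
    high_right = [gain_right * m * rotation.conjugate() for m in high_mono]
    return high_left, high_right
```

The quality of such a simulation depends on how well the low-band image predicts the high-band image, which is why the arrangements described below fall back to transmitted sidechain information for some blocks.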

Encoders as described in connection with the examples of FIGS. 10a/10b and
11a/11b may also include their own local decoder or decoding function, such as a decoder
described in the example of FIG. 12a or a portion thereof (including quantisation), in
order to determine if the two-channel mono/stereo signal and the sidechain information,
when decoded by such a decoder, would provide suitable results. The results of such a
determination could be used to improve the parameters by employing, for example, a
recursive process. In a block encoding and decoding system, as described above,
recursion calculations could be performed, for example, on every block before the next
block ends in order to minimize the delay in transmitting a block of mono/stereo
two-channel audio and its associated spatial parameters.

An arrangement in which the encoder examples of FIGS. 10a/10b and 11a/11b
also include their own decoder or decoding function may also be employed
advantageously when spatial parameters are stored or are sent only for certain blocks
rather than all blocks as in the alternative to the decoder of FIGS. 12a/12b, described
above. If unsuitable decoding would result from not sending spatial-parameter sidechain
information, such sidechain information would be sent for the particular block. In this
case, the decoder would be a further modification of the decoder or decoding function of
FIGS. 12a/12b in that the Recover Spatial Parameters 1202 would have both the ability to
recover spatial-parameter sidechain information for frequencies above the coupling
frequency f_m from the incoming bitstream but also to generate simulated
spatial-parameter sidechain information from the stereo information below the coupling
frequency f_m.

In a simplified alternative to the local-decoder-incorporating encoder examples of
FIGS. 10a/10b and 11a/11b, rather than having a local decoder or decoder function, the
encoder could simply check to determine if there were a minimum signal content below
the coupling frequency f_m (determined in any suitable way, for example, a sum of the
energy in frequency bins through the frequency range), and, if not, it would send or store
spatial-parameter sidechain information rather than not doing so if the energy were above
the threshold. Depending on the encoding scheme, low signal information below the
coupling frequency f_m may also result in more bits being available for sending sidechain
information. In other words, one may not send or may limit the spatial-parameter
sidechain information stored or sent if there is usable signal (as compared to a threshold,
for example) below the coupling frequency because one may derive a useful simulation of
the sidechain information by analyzing the signal below the coupling frequency and
assuming that it also applies above the coupling frequency. A "usable signal" threshold
may be determined in various ways that are not critical to the invention. For example, a
threshold may be based on a combination of absolute signal level and the relative levels
of below-coupling-frequency signal and above-coupling-frequency signal.
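
By way of illustration only, the simplified per-block check described above might be sketched as follows. The function name, the threshold value supplied by the caller, and the use of a plain sum of bin energies over both channels are assumptions of this sketch; as noted above, the manner of determining the threshold is not critical.

```python
def should_send_sidechain(left_bins, right_bins, coupling_bin, energy_threshold):
    """Return True when the signal below the coupling frequency is too weak
    to support a useful simulation of the spatial parameters, so that
    sidechain information should be sent or stored for this block."""
    low_energy = sum(abs(x) ** 2 for x in left_bins[:coupling_bin])
    low_energy += sum(abs(x) ** 2 for x in right_bins[:coupling_bin])
    return low_energy < energy_threshold
```

For example, a block whose two channels each hold bins [1, 1, 5, 5] with the coupling frequency at bin index 2 has a low-band energy of 4; sidechain information would be sent for a threshold of 10 and omitted for a threshold of 3.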

Implementation 

The invention may be implemented in hardware or software, or a combination of
both (e.g., programmable logic arrays). Unless otherwise specified, the processes
included as part of the invention are not inherently related to any particular computer or
other apparatus. In particular, various general-purpose machines may be used with
programs written in accordance with the teachings herein, or it may be more convenient
to construct more specialized apparatus (e.g., integrated circuits) to perform the required
method steps. Thus, the invention may be implemented in one or more computer
programs executing on one or more programmable computer systems each comprising at
least one processor, at least one data storage system (including volatile and non-volatile
memory and/or storage elements), at least one input device or port, and at least one output
device or port. Program code is applied to input data to perform the functions described
herein and generate output information. The output information is applied to one or more
output devices, in known fashion.

Each such program may be implemented in any desired computer language
(including machine, assembly, or high level procedural, logical, or object oriented
programming languages) to communicate with a computer system. In any case, the
language may be a compiled or interpreted language.

Each such computer program is preferably stored on or downloaded to a storage
media or device (e.g., solid state memory or media, or magnetic or optical media)
readable by a general or special purpose programmable computer, for configuring and
operating the computer when the storage media or device is read by the computer system
to perform the procedures described herein. The inventive system may also be considered
to be implemented as a computer-readable storage medium, configured with a computer
program, where the storage medium so configured causes a computer system to operate in
a specific and predefined manner to perform the functions described herein.

It should be understood that implementation of other variations and modifications
of the invention and its various aspects will be apparent to those skilled in the art, and that
the invention is not limited by these specific embodiments described. It is therefore
contemplated to cover by the present invention any and all modifications, variations, or
equivalents that fall within the true spirit and scope of the basic underlying principles
disclosed herein. For example, some of the steps described herein may be order
independent, and thus can be performed in an order different from that described.

1. A hybrid stereophonic/monophonic audio signal encoding method, comprising
generating, in response to a discrete two-channel stereophonic audio signal, an
encoded hybrid stereophonic/monophonic audio signal in which the audio signal is a
discrete two-channel audio signal below a frequency f_m and a single-channel monophonic
audio signal above the frequency f_m,

generating, in response to said discrete two-channel stereophonic audio signal,
spatial parameter information characterising the discrete two-channel stereophonic audio
signal above the frequency f_m, and

combining the hybrid stereophonic/monophonic audio signal with said spatial
parameter information in such a manner that the resulting signal is decodable both by a
decoder configured to decode a discrete two-channel stereophonic audio signal encoded
with the same encoding as applied to the hybrid stereophonic/monophonic audio signal
and by a decoder configured to decode, with the use of the spatial parameter information,
the hybrid stereophonic/monophonic audio signal.

2. A method according to claim 1, wherein said generating a hybrid
stereophonic/monophonic audio signal includes combining the channels of the discrete
two-channel stereophonic audio signal above the frequency f_m, the method further
comprising preprocessing the channels of the discrete two-channel stereophonic audio
signal so that they are in better condition for combining.

3. A method according to claim 2 wherein said processing includes one or both of
(a) adjusting the relative phase angle above the frequency f_m between the two channels so
as to reduce cancellation when the channels are combined, and (b) normalising the
amplitude of each bin of the mono composite channel to have substantially the same
energy as the sum of the contributing energies so as to avoid cancellation of isolated
frequency bins and over-emphasis of in-phase signal.

4. A method according to any one of claims 1-3 further comprising
recovering spatial parameter information,
applying the spatial parameter information to the hybrid
stereophonic/monophonic audio signal so that the augmented monophonic
audio information above the frequency f_m approximates the original stereophonic audio
information, and

determining the degree to which the augmented monophonic information above
the frequency f_m approximates the original stereophonic audio information, and

wherein generating the spatial parameter information is also in response to the
degree to which the augmented monophonic information above the frequency f_m
approximates the original stereophonic audio information.

5. A method according to claim 4, wherein generating spatial parameter
information is part of a recursive process that includes determining the degree to which
the augmented monophonic information above the frequency f_m approximates the original
stereophonic audio information.

6. A method according to any one of claims 1-5 further comprising storing or
sending the combined audio signal and spatial parameter information and wherein said
encoded hybrid stereophonic/monophonic audio signal is encoded using a block encoding
process and the spatial parameter information is stored or sent for every block.

7. A method according to any one of claims 1-5 further comprising storing or
sending the combined audio signal and spatial parameter information and wherein said
encoded hybrid stereophonic/monophonic audio signal is encoded using a block encoding
process and the spatial parameter information is not stored or sent for every block.

8. A method according to any one of claims 1-5 wherein the spatial parameter
information is not sent when the signal energy below the frequency f_m is above a
threshold.

9. A method according to any one of claims 1-8 wherein said discrete
two-channel stereophonic audio signal is derived from a multichannel audio signal having
more than two channels.

10. A method according to claim 9 wherein said discrete two-channel
stereophonic audio signal is derived from a multichannel audio signal having more than
two channels using a matrix encoder.

11. A method according to claim 10 wherein said matrix encoder employs a fixed
matrix.

12. A hybrid stereophonic/monophonic audio signal decoding method,
comprising

recovering spatial parameter information from a combination of an encoded
hybrid stereophonic/monophonic audio signal and spatial parameter information,

decoding the encoded hybrid stereophonic/monophonic audio signal to provide an
audio signal that is a discrete two-channel below a frequency f_m and a single-channel
monophonic audio signal above the frequency f_m,

applying the spatial parameter information to the decoded audio signal so that the
augmented monophonic audio information above the coupling frequency f_m approximates
the original stereophonic audio information, the approximation including an
approximation of the relative phase between the channels, and

deriving more than two channels from the audio approximating the original
stereophonic audio information.

13. A hybrid stereophonic/monophonic audio signal decoding method,
comprising

decoding the encoded hybrid stereophonic/monophonic audio signal to provide an
audio signal that is a discrete two-channel below a frequency f_m and a single-channel
monophonic audio signal above the frequency f_m,

recovering simulated spatial parameter information for the single-channel
monophonic audio signal above the frequency f_m from the discrete two-channel audio
signal below the frequency f_m, and

applying the simulated spatial parameter information to the decoded audio signal
so that the augmented monophonic audio information above the coupling frequency f_m
approximates the original stereophonic audio information.

14. A method according to claim 13 further comprising deriving more than two
channels from the audio approximating the original stereophonic audio information.

15. A method according to claim 12 or claim 14 wherein said more than two
channels are derived using a matrix decoder.

16. A method according to claim 15 wherein said matrix decoder employs a
variable matrix.

17. A method according to claim 15 or claim 16 wherein said matrix decoder
operates at least in part in response to the relative phase between the channels applied to
it.

18. Apparatus adapted to perform the methods of any one of claims 1 through 17.

19. A computer program, stored on a computer-readable medium for causing a
computer to perform the methods of any one of claims 1 through 17.